I am a fourth-year student at IIIT Hyderabad, pursuing a dual degree (B.Tech and MS) in Computer Science and Engineering. I work at LTRC under the guidance of Dr. Rajeev Sangal. My area of interest is NLP.

Research Publications

Sudheer, Prashanth, Manish, Samar. “Experiments with Malt Parser for parsing Indian Language” (ICON-TOOL Contest, 2010)


The courses I have taken at IIIT-H include the following:
  • Data Structures
  • Algorithms
  • Computer Networks
  • Artificial Intelligence
  • Natural Language Processing
  • Computational Linguistics
  • Information Extraction and Retrieval
  • NLP Applications
  • Pattern Recognition

Major Projects

Question Generation for Discourse Cues: We present a system that automatically generates questions from natural language text using explicit discourse connectives. We explore the usefulness of explicit discourse connectives for question generation (QG), looking at the problem beyond the sentence level. Our work divides the QG task into four parts: sense disambiguation of the discourse connectives, identification of the question type, finding the relevant part of the text from which to frame the question, and syntactic transformations. The system is evaluated manually for syntactic and semantic correctness.
Technologies used: Java

Automatic Gap-fill Question Generation from Text Books: Automatic generation of gap-fill questions is an interesting NLP task that is useful for building many educational applications. We present an automatic gap-fill question generation system for textbook chapters. Our system uses gap-fill questions to cover course content effectively. Our approach follows three steps: (1) sentence selection, (2) keyword selection, and (3) distractor selection. The system was evaluated manually by two domain experts. We intend to apply the existing system to textbooks from various domains.
Technologies used: Java
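
The three steps named above (sentence selection, keyword selection, distractor selection) can be illustrated with a minimal Python sketch. This is not the project's Java code: the stopword list, the "at least four content words" rule, the "longest content word" keyword rule, and the length-based distractor ranking are all placeholder heuristics assumed here only to show how the stages fit together.

```python
import re

# Tiny illustrative stopword list (an assumption, not the project's resource).
STOPWORDS = {"the", "a", "an", "of", "in", "is", "are", "and", "to",
             "by", "into", "for", "on", "that", "with", "as"}


def split_sentences(text):
    """Naive sentence splitter on ., ! and ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(sentence):
    """Alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[A-Za-z]+", sentence)
            if w.lower() not in STOPWORDS]


def select_sentences(sentences, min_content_words=4):
    """Step 1: keep sentences that carry enough content (placeholder rule)."""
    return [s for s in sentences if len(content_words(s)) >= min_content_words]


def select_keyword(sentence):
    """Step 2: pick the longest content word as the gap (placeholder rule)."""
    return max(content_words(sentence), key=len)


def select_distractors(keyword, sentences, k=3):
    """Step 3: pick k other content words closest in length to the keyword."""
    pool = []
    for s in sentences:
        for w in content_words(s):
            if w.lower() != keyword.lower() and w not in pool:
                pool.append(w)
    pool.sort(key=lambda w: abs(len(w) - len(keyword)))
    return pool[:k]


def make_gap_fill(text, k=3):
    """Run the three steps and return one question per selected sentence."""
    sentences = split_sentences(text)
    questions = []
    for s in select_sentences(sentences):
        key = select_keyword(s)
        stem = re.sub(r"\b%s\b" % re.escape(key), "_____", s, count=1)
        questions.append({
            "question": stem,
            "answer": key,
            "distractors": select_distractors(key, sentences, k),
        })
    return questions
```

Calling `make_gap_fill(chapter_text)` returns a list of dictionaries with the blanked sentence, the answer, and the distractor candidates. A real system would replace each placeholder rule with a trained or linguistically informed component.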

Statistical Machine Translation: The goal is to build a language-independent machine translation system using a discriminative approach. I am currently experimenting with different tools such as SVM and MaxEnt.
Technologies used: Python

Parse Ranking: Selecting the best parse from the multiple parses produced by a parser such as CBHP (constraint-based hybrid parser).
Technologies used: Python

An Iterative Approach to Improve the State of the Art in POS Tagging, Chunking, and NER Tagging: To address data sparseness in POS tagging and chunking for Hindi, I used an iterative approach in which the chunking output is fed back into POS tagging, and so on, with three different tools: CRF, TnT, and MaxEnt. I used the same procedure for NER and chunking.
Technologies used: Python

Developing a Search Engine for the Entire Wikipedia Dump and Extracting and Populating Missing Information Boxes: The goal was to create a web search engine for the entire Wikipedia dump. This involved indexing around 24 GB of data and building a keyword-based search engine.
Technologies used: Java
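
The core idea behind a keyword-based search engine, an inverted index mapping each term to the documents that contain it, can be sketched in a few lines of Python. This is an assumed, in-memory illustration and not the project's Java implementation; the function names and the AND-style query semantics are hypothetical, and a 24 GB dump would need an on-disk index rather than a dictionary held in memory.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids containing it.

    `docs` is a dict of {doc_id: text}.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return sorted ids of documents containing every query term (AND)."""
    terms = tokenize(query)
    if not terms:
        return []
    postings = [index.get(t, set()) for t in terms]
    return sorted(set.intersection(*postings))
```

For example, after `index = build_index({1: "Hyderabad is a city in India", 2: "India is a country"})`, the call `search(index, "hyderabad india")` intersects the posting sets of both terms and returns only the documents that contain both.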