Distributional Features for Text Categorization
Text categorization is the task of assigning predefined categories to natural language text. With the widely used “bag-of-words” representation, previous research usually assigns each word a value that expresses whether the word appears in the document concerned or how frequently it appears. Although these values are useful for text categorization, they do not fully express the information contained in the document. This paper explores the effect of other types of values, which express the distribution of a word in the document. These novel values are called distributional features; they include the compactness of the appearances of the word and the position of its first appearance. The proposed distributional features are exploited through a tf-idf-style equation, and the different features are combined using ensemble learning techniques. Experiments show that the distributional features are useful for text categorization. Compared with using traditional term frequency values alone, including the distributional features requires only a little additional cost, while categorization performance can be significantly improved. Further analysis shows that the distributional features are especially useful when documents are long and the writing style is casual.
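To make the two kinds of distributional features concrete, here is a minimal Java sketch. It is an illustration under stated assumptions, not the paper's implementation: it assumes a document is split into sentences as its "parts", and the three compactness measures shown (number of parts containing the word, distance between the first and last such part, and count-weighted variance of the part positions) are plausible choices rather than confirmed definitions. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; names and feature definitions are assumptions for illustration.
class DistributionalFeatures {

    // Splits the document into sentences (assumed "parts") and counts the word in each one.
    static int[] distribution(String document, String word) {
        String target = word.toLowerCase();
        List<Integer> counts = new ArrayList<>();
        for (String part : document.split("[.!?]+")) {
            String trimmed = part.trim().toLowerCase();
            if (trimmed.isEmpty()) continue;          // skip empty fragments
            int count = 0;
            for (String token : trimmed.split("\\W+")) {
                if (token.equals(target)) count++;
            }
            counts.add(count);
        }
        int[] dist = new int[counts.size()];
        for (int i = 0; i < dist.length; i++) dist[i] = counts.get(i);
        return dist;
    }

    // Position (part index) of the first appearance, or -1 if the word never appears.
    static int firstAppearance(int[] dist) {
        for (int i = 0; i < dist.length; i++) if (dist[i] > 0) return i;
        return -1;
    }

    // Compactness, variant 1: number of parts in which the word appears.
    static int partsWithWord(int[] dist) {
        int n = 0;
        for (int c : dist) if (c > 0) n++;
        return n;
    }

    // Compactness, variant 2: distance between the first and last part containing the word.
    static int firstLastDistance(int[] dist) {
        int first = firstAppearance(dist);
        if (first < 0) return 0;
        int last = first;
        for (int i = first; i < dist.length; i++) if (dist[i] > 0) last = i;
        return last - first;
    }

    // Compactness, variant 3: variance of part positions, weighted by occurrence counts.
    static double positionVariance(int[] dist) {
        int total = 0;
        double sum = 0.0;
        for (int i = 0; i < dist.length; i++) { total += dist[i]; sum += i * dist[i]; }
        if (total == 0) return 0.0;
        double mean = sum / total;
        double var = 0.0;
        for (int i = 0; i < dist.length; i++) var += dist[i] * (i - mean) * (i - mean);
        return var / total;
    }

    public static void main(String[] args) {
        String doc = "Java is popular. Many tools support Java. The weather is nice. Java again.";
        int[] dist = distribution(doc, "java");
        System.out.println("first appearance:  " + firstAppearance(dist));
        System.out.println("parts with word:   " + partsWithWord(dist));
        System.out.println("first-last dist:   " + firstLastDistance(dist));
        System.out.println("position variance: " + positionVariance(dist));
    }
}
```

A word that is spread across the whole document yields larger compactness values in all three variants than a word confined to one sentence, which is the kind of information a plain term frequency count cannot express.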
Existing System:
• Existing text categorization characterizes a word only by whether it appears in a document or how frequently it appears.
• This representation is not enough to fully capture the information contained in a document.
Proposed System:
• Distributional features, which describe how a word is distributed within a document, are designed for text categorization.
• Distributional features can help improve categorization performance.
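As a rough sketch of how a distributional feature might be plugged into a tf-idf-style weight, the Java example below replaces the usual term frequency factor with an importance score derived from the position of the word's first appearance, then multiplies it by the inverse document frequency. The linear decay used for the importance score, and all class, method, and parameter names, are assumptions made for illustration; the source does not state the exact equation.

```java
// Hypothetical sketch: weight = importance(word, document) * idf(word).
class DistributionalWeight {

    // Standard inverse document frequency: log(N / df). Returns 0 for unseen words.
    static double idf(int numDocs, int docFreq) {
        if (numDocs <= 0 || docFreq <= 0) return 0.0;
        return Math.log((double) numDocs / docFreq);
    }

    // Assumed importance function: 1.0 when the word first appears in the first part,
    // decaying linearly as the first appearance moves later in the document.
    // firstApp is the zero-based index of the first part containing the word (-1 if absent).
    static double firstAppearanceImportance(int firstApp, int numParts) {
        if (firstApp < 0 || numParts <= 0) return 0.0;
        return (double) (numParts - firstApp) / numParts;
    }

    // tf-idf-style weight with the distributional importance in place of term frequency.
    static double weight(int firstApp, int numParts, int numDocs, int docFreq) {
        return firstAppearanceImportance(firstApp, numParts) * idf(numDocs, docFreq);
    }

    public static void main(String[] args) {
        // A word first seen in part 0 of 4 outweighs one first seen in part 3 of 4,
        // given the same document frequency.
        System.out.println("early word: " + weight(0, 4, 100, 10));
        System.out.println("late word:  " + weight(3, 4, 100, 10));
    }
}
```

Separate weights of this kind could be computed for each distributional feature, with one classifier trained per feature representation and their predictions combined, which is one way to read the ensemble learning step.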
Hardware Specification:
Processor : Pentium IV, 2.6 GHz
RAM : 512 MB DDR RAM
Monitor : 15” Color
Hard Disk : 20 GB
Software Specification:
Front End : Java, Swing
Tools Used : JBuilder
Operating System : Windows XP