Kaustubh Mishra's Journey: From NIT Silchar to Fellowship.ai
Table of Contents
- Introduction
- Background
  - Early Education
  - Pursuing Master's
- Motivation for Fellowship.ai
  - Desire to Contribute
  - Industrial Exposure
  - Networking and Learning Opportunity
- NLP Challenge Submission
  - Choice of Platform: Kaggle
  - GitHub Repository
- Data Cleaning Process
  - Removal of HTML Sequences and Punctuations
  - Standardizing Reviews
- Model Development
  - LSTM Layer Architecture
  - Utilization of Embedding Vectors
- Training and Testing
  - Accuracy Achieved
  - Utilization of Pre-trained Word Vectors
- Ways to Improve the Model
  - Utilization of Attention Layers
  - Sentence Embeddings
  - Data Augmentation for Regularization
- Conclusion
- Highlights
Introduction
Hey there! 👋 Meet Costa Mishra, who is pursuing a Master's in computer engineering at New York University. Here's a glimpse into his journey, his aspirations, and his recent NLP challenge submission to Fellowship.ai.
Background
Early Education
Costa began his education in Lucknow, India, and went on to complete his undergraduate degree in electronics and instrumentation engineering at NIT Silchar.
Pursuing Master's
Costa is currently studying computer engineering at New York University, where he works on machine learning and deep learning projects.
Motivation for Fellowship.ai
Desire to Contribute
Driven by a desire to tackle real-world challenges, Costa looks for ways to make a tangible impact. Fellowship.ai matches his career aspirations, offering a platform to channel his enthusiasm into meaningful work.
Industrial Exposure
Costa has built his skills through academic projects, but he wants real-world exposure to refine them. Fellowship.ai is an opportunity to bridge the gap between theory and practice and to gain industry experience.
Networking and Learning Opportunity
Beyond technical proficiency, Costa values the collaborative spirit of Fellowship.ai. Interacting with like-minded individuals and mentors promises a rich learning experience, fostering personal growth and skill enhancement.
NLP Challenge Submission
Choice of Platform: Kaggle
For his natural language processing project, Costa chose Kaggle for its robust infrastructure and GPU availability. Using Kaggle's resources, he built a sentiment analysis project on the IMDb dataset of 50,000 movie reviews.
GitHub Repository
To showcase the project, Costa published a GitHub repository containing his notebook and related resources, which documents his work in NLP.
Data Cleaning Process
Removal of HTML Sequences and Punctuations
Costa cleaned the dataset by removing irrelevant HTML sequences and punctuation marks, making the review text more coherent and clear.
Standardizing Reviews
By standardizing reviews and removing redundant words, Costa ensured consistency in his dataset, laying a solid foundation for subsequent analysis.
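The cleaning steps described above could look roughly like the sketch below. It is not taken from Costa's notebook: the function name `clean_review`, the tiny `STOP_WORDS` set, and the choice to lowercase the text are assumptions made for illustration, using only the Python standard library.

```python
import html
import re
import string

# Hypothetical stop-word list for illustration; the project's actual list is not given.
STOP_WORDS = {"a", "an", "the", "is", "was", "of", "and", "this"}

TAG_RE = re.compile(r"<[^>]+>")
# Map every ASCII punctuation character to a space.
PUNCT_TABLE = str.maketrans({ch: " " for ch in string.punctuation})


def clean_review(text: str) -> str:
    """Strip HTML tags, punctuation, and stop words; lowercase the result."""
    text = TAG_RE.sub(" ", text)        # drop tags such as <br />
    text = html.unescape(text)          # turn entities like &amp; back into characters
    text = text.translate(PUNCT_TABLE)  # replace punctuation with spaces
    tokens = text.lower().split()       # lowercase and collapse whitespace
    return " ".join(t for t in tokens if t not in STOP_WORDS)


raw = "This movie was <br /><br />GREAT!!! Loved the acting &amp; plot."
print(clean_review(raw))  # prints: movie great loved acting plot
```

Replacing punctuation with spaces (rather than deleting it) keeps words separated when a review omits the space after a comma or period, at the cost of splitting contractions such as "don't" into two tokens.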
Model Development
LSTM Layer Architecture
Costa's model used a single LSTM layer with 128 cells on top of embedding vectors, which was enough to achieve strong accuracy on the sentiment analysis task.
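To give a sense of the size of such a model, the sketch below counts the trainable parameters of an embedding layer, one LSTM layer, and a single output unit. Only the 128 LSTM cells come from the text; the vocabulary size, the embedding dimension, and the single sigmoid output unit are assumptions chosen for illustration.

```python
def lstm_param_count(input_dim: int, units: int) -> int:
    """Trainable parameters in one standard LSTM layer.

    Each of the 4 gates (input, forget, cell candidate, output) has an
    input kernel, a recurrent kernel, and a bias vector.
    """
    per_gate = units * input_dim + units * units + units
    return 4 * per_gate


# Assumed values: only LSTM_UNITS = 128 is stated in the article.
VOCAB_SIZE = 20_000   # assumption: vocabulary kept after cleaning
EMBEDDING_DIM = 100   # assumption: dimensionality of the word vectors
LSTM_UNITS = 128

embedding_params = VOCAB_SIZE * EMBEDDING_DIM  # lookup table of word vectors
lstm_params = lstm_param_count(EMBEDDING_DIM, LSTM_UNITS)
dense_params = LSTM_UNITS + 1                  # assumed single sigmoid output unit

print(embedding_params, lstm_params, dense_params)  # prints: 2000000 117248 129
```

Under these assumptions the embedding table dominates the parameter count, which is one reason pre-trained vectors are attractive: the largest part of the model does not have to be learned from 50,000 reviews alone.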
Utilization of Embedding Vectors
Costa used pre-trained word vectors to give the model a richer representation of the text, improving its predictive capability.
Training and Testing
Accuracy Achieved
Costa's model reached accuracy between 87% and 89%, showing that it can discern sentiment in movie reviews effectively.
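For reference, accuracy on a binary sentiment task is the fraction of reviews whose predicted label matches the true label. The sketch below assumes the model outputs a probability of "positive" per review and that 0.5 is used as the decision threshold; the numbers are made up and are not results from the project.

```python
def accuracy(probabilities, labels, threshold=0.5):
    """Fraction of reviews whose thresholded prediction matches the label."""
    if len(probabilities) != len(labels):
        raise ValueError("probabilities and labels must have the same length")
    predictions = [1 if p >= threshold else 0 for p in probabilities]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


# Made-up model outputs for illustration only.
probs = [0.92, 0.15, 0.67, 0.40, 0.81, 0.05, 0.55, 0.30]
labels = [1, 0, 1, 1, 1, 0, 0, 0]
print(accuracy(probs, labels))  # prints: 0.75
```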
Utilization of Pre-trained Word Vectors
By using pre-trained GloVe word vectors, Costa strengthened the model's grasp of word meaning, supporting more nuanced analysis.
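GloVe vectors are distributed as plain text, one token per line followed by its vector components. A minimal sketch of turning such a file into an embedding matrix is shown below. The `word_index` mapping, the helper names, and the three-dimensional sample vectors are hypothetical; a real run would read lines from a downloaded file such as `glove.6B.100d.txt` instead of the in-memory sample.

```python
from typing import Dict, Iterable, List


def parse_glove(lines: Iterable[str]) -> Dict[str, List[float]]:
    """Parse GloVe text format: a token followed by space-separated floats."""
    vectors = {}
    for line in lines:
        parts = line.rstrip().split(" ")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors


def build_embedding_matrix(word_index, vectors, dim):
    """Row i holds the vector of the word with index i; row 0 is padding.

    Words that are missing from the pre-trained vectors keep an all-zero row.
    """
    matrix = [[0.0] * dim for _ in range(len(word_index) + 1)]
    for word, idx in word_index.items():
        vec = vectors.get(word)
        if vec is not None and len(vec) == dim:
            matrix[idx] = vec
    return matrix


# Tiny in-memory stand-in for a real GloVe file (values are made up).
sample = ["good 0.1 0.2 0.3", "bad -0.1 -0.2 -0.3"]
vectors = parse_glove(sample)
word_index = {"good": 1, "bad": 2, "meh": 3}  # hypothetical tokenizer output
matrix = build_embedding_matrix(word_index, vectors, dim=3)
```

The resulting matrix can then be used as the initial weights of an embedding layer, with the out-of-vocabulary word ("meh" here) left as zeros.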
Ways to Improve the Model
Utilization of Attention Layers
Costa envisages incorporating attention layers to enhance his model's interpretability and ability to capture nuanced semantic information.
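One simple form of attention over LSTM outputs scores each time step against a query vector, normalizes the scores with a softmax, and returns the weighted sum of the hidden states. The sketch below shows that computation in plain Python. It is only an illustration of the idea, not the layer Costa plans to use, and the query vector (which would be learned in a real model) is fixed by hand here.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Dot-product attention over per-token hidden states.

    hidden_states: list of T vectors (one per time step), each of length d.
    query: a length-d vector; in a real model this would be learned.
    Returns (weights, context), where context is the weighted sum of states.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(query)
    context = [sum(w * h[j] for w, h in zip(weights, hidden_states)) for j in range(dim)]
    return weights, context


# Toy example: three time steps, two hidden dimensions (values are made up).
states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_pool(states, query=[1.0, 1.0])
```

The weights sum to 1 and can be inspected per token, which is where the interpretability benefit comes from: the third time step, whose state aligns best with the query, receives the largest weight.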
Sentence Embeddings
Exploring techniques like sentence embeddings, Costa aims to capture inter-sentence relationships, enriching his model's contextual understanding.
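As a baseline illustration of the idea, a sentence embedding can be built by averaging the word vectors of a sentence, and two sentences can then be compared with cosine similarity. This is the simplest possible approach and is an assumption for illustration; the article does not say which sentence embedding technique Costa has in mind. The two-dimensional word vectors below are made up.

```python
import math


def sentence_embedding(sentence, word_vectors, dim):
    """Average the vectors of known words; zero vector if none are known."""
    known = [word_vectors[w] for w in sentence.lower().split() if w in word_vectors]
    if not known:
        return [0.0] * dim
    return [sum(vec[j] for vec in known) / len(known) for j in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Made-up 2-d word vectors for illustration only.
toy_vectors = {"great": [1.0, 0.0], "film": [0.5, 0.5], "boring": [-1.0, 0.0]}
s1 = sentence_embedding("Great film", toy_vectors, dim=2)   # [0.75, 0.25]
s2 = sentence_embedding("Boring film", toy_vectors, dim=2)  # [-0.25, 0.25]
print(cosine_similarity(s1, s2) < 0)  # prints: True
```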
Data Augmentation for Regularization
To improve model stability and generalization, Costa proposes data augmentation, which acts as a regularizer and reduces the risk of overfitting.
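Two common text augmentation operations are randomly deleting words and randomly swapping word positions, so the model sees slightly different versions of each review. The sketch below is an assumption about what such augmentation could look like, not the specific technique Costa proposes; the function names and the deletion probability are illustrative.

```python
import random


def random_deletion(tokens, p, rng):
    """Drop each token with probability p, always keeping at least one."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]


def random_swap(tokens, n_swaps, rng):
    """Swap two randomly chosen positions, n_swaps times."""
    out = list(tokens)
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


rng = random.Random(0)  # fixed seed so runs are repeatable
review = "the plot was slow but the acting was superb".split()
augmented = [random_deletion(review, 0.1, rng), random_swap(review, 1, rng)]
```

Each augmented copy keeps the original sentiment label. Small perturbation rates are usually preferred, since aggressive deletion can remove the very words that carry the sentiment.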
Conclusion
In conclusion, Costa extends his gratitude for the opportunity to showcase his endeavors. He eagerly anticipates the prospect of contributing to the Fellowship.ai community, driven by a relentless pursuit of knowledge and growth.
Highlights
- Costa Mishra's journey from NIT Silchar to New York University.
- Motivation to join Fellowship.ai: Desire for real-world exposure and collaborative learning.
- NLP Challenge Submission: Sentiment analysis project on Kaggle.
- Model Development: A single LSTM layer with 128 cells and pre-trained GloVe word vectors.
- Strategies for Model Improvement: Incorporation of attention layers and data augmentation techniques.
FAQ
Q: How did Costa choose Kaggle for his NLP project?
A: Costa was drawn to Kaggle for its robust infrastructure and GPU availability, facilitating efficient experimentation and model development.
Q: What motivated Costa to pursue Fellowship.ai?
A: Costa's desire for real-world exposure and collaborative learning opportunities fueled his interest in Fellowship.ai, aligning with his career aspirations.