From a trained model in a notebook to a live web app anyone can use.
Libraries: TensorFlow, Keras, Pandas, Streamlit • Estimated Time: 3 hours
So far, all our amazing models have lived inside a Colab notebook. They can't do anything unless we are there to run the code. The final, crucial step in any real-world AI project is deployment: packaging your model into a user-friendly application.
Today, we'll build a simple web app for our sentiment analysis model. This means anyone, even someone with no coding knowledge, can visit a webpage, type in a sentence, and get a prediction.
Building a web app from scratch usually involves learning HTML, CSS, and JavaScript frameworks. It's complicated! Streamlit is a magical Python library that lets you build beautiful, interactive web apps for machine learning with just a few lines of Python. No web development experience needed!
Before we can build an app, we need a model. We'll train an LSTM on a real-world dataset of tweets to classify them as positive or negative. The most important new step is saving our trained model and tokenizer so our web app can use them later.
We'll use the "Sentiment140" dataset, which contains 1.6 million tweets with sentiment labels (0 = negative, 4 = positive).
Before diving in, it's always good practice to inspect your data. Use `df.head()` to see the first few rows and `df['sentiment'].value_counts()` to see if the number of positive and negative tweets is balanced.
This is the same process as Lab 9: convert text to numbers.
A key skill is diagnosing your model's training. The `history` object contains the accuracy and loss for each epoch. Use Matplotlib to plot the training accuracy vs. the validation accuracy. Is the model overfitting? (Hint: `history.history['accuracy']` and `history.history['val_accuracy']`)
To use this model in another script (our web app), we must save two things: the trained model's weights and the tokenizer. The tokenizer is just as important as the model, because our app needs to preprocess user input in the exact same way.
After running this, use the file browser in Colab to download `sentiment_model.h5` and `tokenizer.pickle` to your computer. You'll need them for the next part.
Now we leave Colab and move to our own computer. Create a new folder on your desktop, and place the two files you just downloaded (`sentiment_model.h5` and `tokenizer.pickle`) inside it.
Open a terminal or command prompt on your computer and run:
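The exact package list is an assumption, but the app needs at least Streamlit, plus TensorFlow (which includes Keras) to load the saved model:

```shell
pip install streamlit tensorflow
```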
Inside your new folder, create a new Python file named `app.py`. Open it in a text editor (like VS Code, Sublime Text, or even Notepad) and paste the following code:
Go back to your terminal, make sure you are in the folder where `app.py` is located, and run this command:
Your web browser should automatically open with your new application running! Try it out!
Model prediction isn't instant. To improve the user experience, wrap the prediction logic in a `with st.spinner('Analyzing...'):` block. This will show a nice loading animation while the model is working.
Instead of just printing the score, use a visual element. After the `st.success` or `st.error`, add `st.progress(sentiment_score)` to show a progress bar. Or, try `st.bar_chart({'sentiment': [sentiment_score, 1-sentiment_score]})` to show both positive and negative probabilities.
Your base application works, but we can make it better. Your mission is to add new features to `app.py`.
The model we built was simple to keep training fast. A more powerful and modern approach is to use pre-trained word embeddings like GloVe.
This is a classic binary sentiment classification dataset, but the text is much longer and more complex than tweets.
Your goal is to build a highly accurate sentiment classifier for this dataset and deploy it with Streamlit.
Since this project involves multiple local files, submitting a Colab link isn't enough.