1. Installing and Importing Packages:
The code starts by installing and importing the required packages: TensorFlow, pandas, scikit-learn, NumPy, the `re` module for regular expressions, NLTK, Matplotlib, and the Hugging Face Transformers library.
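The setup might look like the following sketch (the package versions and import aliases are assumptions, not taken from the original code):

```python
# Install first, e.g. in a notebook cell:
#   pip install tensorflow pandas scikit-learn numpy nltk matplotlib transformers

import re
import pickle

import numpy as np
import pandas as pd
import tensorflow as tf
import nltk
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.metrics import f1_score, classification_report
from transformers import DistilBertTokenizer, TFDistilBertModel

# NLTK corpora (such as the stopword list) must be downloaded once before use.
nltk.download("stopwords", quiet=True)
```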
2. Preprocessing and Cleaning Functions:
Several helper functions are defined for preprocessing and cleaning the text data: removing stopwords, short words, and special characters, and converting text to lowercase.
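A minimal sketch of such a cleaning helper is shown below; the stopword list and the minimum word length are assumptions, and the real code likely uses NLTK's stopword corpus instead of a hard-coded set:

```python
import re

# Hypothetical stopword set; the original code likely uses nltk's list.
STOPWORDS = {"the", "a", "an", "is", "in", "on", "and", "to", "of"}

def clean_text(text, min_word_len=3):
    text = text.lower()                    # convert to lowercase
    text = re.sub(r"[^a-z\s]", " ", text)  # strip special characters and digits
    words = [
        w for w in text.split()
        if w not in STOPWORDS and len(w) >= min_word_len  # drop stopwords, short words
    ]
    return " ".join(words)

print(clean_text("The QUICK brown fox, is on a log!"))  # → "quick brown fox log"
```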
3. Reading and Cleaning the Dataset:
Reads the dataset from a CSV file, drops unnecessary columns, removes rows with NaN values, and shuffles the rows.
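This step could be sketched as follows; the column names and the inline CSV sample are invented for illustration (the real code reads an actual file):

```python
import io
import pandas as pd

# Toy in-memory CSV standing in for the real dataset file.
csv_data = io.StringIO(
    "id,text,label,unused\n"
    "1,great product,1,x\n"
    "2,,0,y\n"
    "3,terrible service,0,z\n"
)

df = pd.read_csv(csv_data)
df = df.drop(columns=["id", "unused"])  # drop unnecessary columns
df = df.dropna()                        # remove rows with NaN values
df = df.sample(frac=1, random_state=42).reset_index(drop=True)  # shuffle

print(len(df))  # 2 rows remain after the empty-text row is dropped
```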
4. Loading DistilBERT Tokenizer and Model:
Loads the DistilBERT tokenizer and model from the Hugging Face Transformers library.
5. Preparing Input for the Model:
Sets the maximum length for input sentences.
Tokenizes and encodes sentences using the DistilBERT tokenizer.
Prepares input sentences, attention masks, and labels for model training.
6. Creating a Basic NN Model Using DistilBERT Embeddings:
Defines a neural network model that uses DistilBERT embeddings.
The model includes a Dense layer, Dropout layer, and output layer.
7. Saving Model Input in Pickle Files:
Saves the model input (input_ids, attention_masks, labels) into pickle files for later use.
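Caching the encoded inputs avoids re-tokenizing on every run. A sketch with toy arrays (the file names are assumptions):

```python
import pickle
import numpy as np

# Toy stand-ins for the real encoded inputs.
input_ids = np.zeros((2, 32), dtype=np.int32)
attention_masks = np.ones((2, 32), dtype=np.int32)
labels = np.array([1, 0])

for name, arr in [("input_ids", input_ids),
                  ("attention_masks", attention_masks),
                  ("labels", labels)]:
    with open(f"{name}.pkl", "wb") as f:
        pickle.dump(arr, f)

# Later, reload without re-running the tokenizer:
with open("labels.pkl", "rb") as f:
    restored = pickle.load(f)
print(restored.tolist())  # [1, 0]
```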
8. Train-Test Split and Model Compilation:
Splits the data into training and validation sets.
Defines the loss function, metrics, and optimizer for the model.
Compiles the model.
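This step might look as follows; the split ratio, learning rate, and the small stand-in classifier (used here so the sketch runs without downloading DistilBERT) are assumptions:

```python
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

# Toy encoded inputs standing in for the real arrays.
input_ids = np.random.randint(0, 1000, size=(10, 32))
attention_masks = np.ones((10, 32), dtype=np.int32)
labels = np.random.randint(0, 2, size=(10,))

# Split ids, masks, and labels consistently into train/validation sets.
(train_ids, val_ids,
 train_masks, val_masks,
 train_labels, val_labels) = train_test_split(
    input_ids, attention_masks, labels, test_size=0.2, random_state=42)

# Stand-in model; in the real code this is the DistilBERT-based model from step 6.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(),  # loss function
    optimizer=tf.keras.optimizers.Adam(learning_rate=2e-5),  # optimizer
    metrics=["accuracy"],  # metrics
)
```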
9. Training the Model:
Trains the model on the training data, validating on the validation set.
Saves the best model based on validation loss.
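The training loop with best-model checkpointing could be sketched like this; the epochs, batch size, checkpoint path, and toy data are assumptions:

```python
import numpy as np
import tensorflow as tf

# Toy data and a small stand-in model so the sketch runs quickly.
x_train = np.random.rand(8, 32).astype("float32")
y_train = np.random.randint(0, 2, size=(8,))
x_val = np.random.rand(2, 32).astype("float32")
y_val = np.random.randint(0, 2, size=(2,))

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(32,)),
    tf.keras.layers.Dense(2, activation="softmax"),
])
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")

# Keep only the weights with the best (lowest) validation loss.
checkpoint = tf.keras.callbacks.ModelCheckpoint(
    "best_model.weights.h5",
    monitor="val_loss",
    save_best_only=True,
    save_weights_only=True,
)

history = model.fit(
    x_train, y_train,
    validation_data=(x_val, y_val),
    epochs=2,
    batch_size=4,
    callbacks=[checkpoint],
    verbose=0,
)
```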
10. TensorBoard Visualization:
Uses TensorBoard to visualize training and validation curves.
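TensorBoard logging is wired in via a callback; the log directory name here is an assumption:

```python
import tensorflow as tf

# Write training/validation scalars to a log directory TensorBoard can read.
tensorboard_cb = tf.keras.callbacks.TensorBoard(log_dir="logs")

# Pass it alongside the other callbacks during training:
#   model.fit(..., callbacks=[checkpoint, tensorboard_cb])
# Then inspect the curves in a browser with:
#   tensorboard --logdir logs
```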
11. Model Evaluation:
Loads the saved model weights.
Uses the model to make predictions on the validation set.
Calculates and prints the F1 score and classification report.
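The evaluation step might look like the sketch below; the prediction probabilities are toy values, whereas in the real code they would come from `model.predict` on the validation set:

```python
import numpy as np
from sklearn.metrics import f1_score, classification_report

val_labels = np.array([1, 0, 1, 1, 0])
pred_probs = np.array([[0.2, 0.8],
                       [0.9, 0.1],
                       [0.4, 0.6],
                       [0.7, 0.3],
                       [0.6, 0.4]])
# Take the most probable class for each validation example.
pred_labels = np.argmax(pred_probs, axis=1)  # [1, 0, 1, 0, 0]

print(f1_score(val_labels, pred_labels))         # → 0.8
print(classification_report(val_labels, pred_labels))
```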
12. Conclusion:
Creates and compiles a new model for future use.
Prints the F1 score and classification report on the validation set.
Overall, the code demonstrates the end-to-end process of fine-tuning a DistilBERT model for text classification using TensorFlow and Keras.