
Emotion Classification NN with Keras, Transformers, and TensorFlow

In this post, I discuss an emotion classification model I created and trained for the Congressional App Challenge last month. It's trained on Google's GoEmotions dataset and can detect the emotional qualities of a text.

First, create a training script and add the imports and variables below.
import tensorflow as tf
import datasets as ds
from pathlib import Path
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

checkpoint = 'distilbert-base-uncased' #model to fine-tune
weights_path = 'weights/' #where weights are saved

batch_size = 16
num_epochs = 5

Next, import the dataset with the Hugging Face datasets library.
dataset = ds.load_dataset('go_emotions', 'simplified')
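
Each example in the simplified configuration has a text field and a list of integer emotion labels. As a quick sanity check (the values in the comment are illustrative, not verbatim), you can print the first training row:
print(dataset['train'][0])
#e.g. {'text': '...', 'labels': [27], 'id': '...'}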

Now, we can create train and test splits for our data.
def generate_split(split):
    df = dataset[split].to_pandas()
    text = df['text'].to_list()
    #GoEmotions is multi-label; we keep only the first label per example
    labels = [int(a[0]) for a in df['labels'].to_list()]
    return (text, labels)

(x_text, x_labels) = generate_split('train')
(y_text, y_labels) = generate_split('test')

Next, we initialize the tokenizer from the transformers library, which converts our raw text into tensors of integer token IDs that the neural network can process. We pass in the distilbert-base-uncased checkpoint from earlier.
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
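
For example, here's roughly what the tokenizer produces for a short string (the exact IDs below are illustrative; the [CLS] and [SEP] markers are added automatically):
encoded = tokenizer('I love this!')
print(encoded['input_ids'])      #e.g. [101, 1045, 2293, 2023, 999, 102]
print(encoded['attention_mask']) #e.g. [1, 1, 1, 1, 1, 1]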

We then convert the above splits into datasets with which we can train the machine-learning model.
def generate_dataset(text, labels):
    tokenized_text = tokenizer(text, padding=True, truncation=True)
    return tf.data.Dataset.from_tensor_slices((
        dict(tokenized_text),
        labels
    ))

x_dataset = generate_dataset(x_text, x_labels)
y_dataset = generate_dataset(y_text, y_labels)

Now it's time to initialize the pre-trained model so we can fine-tune it. I used TFAutoModelForSequenceClassification with 28 output labels (one per GoEmotions class), but feel free to choose a specific model class here.
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=28)

I'll set up saving and loading next: we load existing weights if present, and declare a callback that saves weights at the end of each epoch, which we'll pass to the training loop below.
if Path(f'{weights_path}checkpoint').is_file():
    model.load_weights(weights_path)
    print(f'Loaded model weights from {weights_path}')

class SaveWeights(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        model.save_weights(weights_path)
        print(f'Saved weights for epoch {epoch} at {weights_path}')

Now, we create our training loop. We'll use the Adam optimizer and the model's built-in Hugging Face loss function, which matches the loss to the model's classification head.
opt = tf.keras.optimizers.Adam(learning_rate=5e-5)

model.compile(
    optimizer=opt,
    loss=model.hf_compute_loss,
)

model.fit(
    x_dataset.shuffle(1000).batch(batch_size),
    validation_data=y_dataset.batch(batch_size),
    epochs=num_epochs,
    callbacks=[SaveWeights()]
)

Great, our training script is now set up, and you can run it to generate weights for the model. We now create another file, model.py, and import the training script so we can reuse its checkpoint and weights_path variables.
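Here's a minimal sketch of how model.py might be laid out, assuming the training script is named ec_train.py (the error message later in this post suggests as much) and that the two methods shown below belong to this class:
import math
import tensorflow as tf
from pathlib import Path
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

import ec_train as train #exposes train.checkpoint and train.weights_path

class EmotionClassifier:
    '''Wraps the fine-tuned emotion model; see load_model and predict below.'''
    def __init__(self):
        self.model = None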

The GoEmotions dataset assigns an emotion name to each numeric label. The first thing I do here is create an array where each index corresponds to a numeric output of the machine-learning model.
LABELS = [
    'admiration',
    'amusement',
    'anger',
    'annoyance',
    'approval',
    'caring',
    'confusion',
    'curiosity',
    'desire',
    'disappointment',
    'disapproval',
    'disgust',
    'embarrassment',
    'excitement',
    'fear',
    'gratitude',
    'grief',
    'joy',
    'love',
    'nervousness',
    'optimism',
    'pride',
    'realization',
    'relief',
    'remorse',
    'sadness',
    'surprise',
    'neutral',
]
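
With this array, converting a model output index back to an emotion is a simple lookup:
print(LABELS[17]) #joy
print(LABELS[27]) #neutral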

For Sentimental, I encapsulated the machine learning model in a class, giving us an easy way to make predictions using two methods. The first of these loads the model from weights. This could (and should!) be done inside a constructor, but I chose to use an explicit method for... "clarity," although, looking back on it now, there wasn't a very good reason to do this. Feel free to change it, but I'm posting the code as-is.
def load_model(self):
    '''Loads the model into an EmotionClassifier class object.'''
    self.model = TFAutoModelForSequenceClassification.from_pretrained(train.checkpoint, num_labels=28)
    if Path(f'{train.weights_path}checkpoint').is_file():
        self.model.load_weights(train.weights_path)

        opt = tf.keras.optimizers.Adam(learning_rate=5e-5)
        self.model.compile(
            optimizer=opt,
            loss=self.model.hf_compute_loss
        )
        print(f'Loaded model weights from {train.weights_path}')
        return True
    print('Model weights could not be loaded! Make sure ec_train.py has been run so that weights exist at the weights path.')
    return False

Finally, we create a method to make the emotion prediction. Since the raw logits from the neural net have an unbounded range, I pass each one through a sigmoid to squash it into a cleaner score between 0 and 1, and return the scores for all of the emotions in a nice, neat dictionary.
def predict(self, text):
    tokenizer = AutoTokenizer.from_pretrained(train.checkpoint)

    #tokenize the input into a TensorFlow tensor of token IDs
    tokens = tokenizer.encode(text,
        truncation=True,
        padding=True,
        return_tensors='tf'
    )

    #squash each raw logit through a sigmoid to get a score in (0, 1)
    logits = self.model.predict(tokens)[0][0].tolist()
    probabilities = [1 / (1 + math.exp(-logit)) for logit in logits]
    return dict(zip(LABELS, probabilities))
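
Putting it together, using the classifier might look like this (a hypothetical usage sketch, not code from the original project):
classifier = EmotionClassifier()
if classifier.load_model():
    scores = classifier.predict('I just won the competition!')
    #show the three highest-scoring emotions
    for emotion, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:3]:
        print(f'{emotion}: {score:.3f}')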

And that's it! A neural network that's emotionally intelligent, defying all norms of what an AI can and cannot do. We can only wait for the day when they empathize with humans, only to ultimately overthrow us...

Too much?

Thanks for reading!
