Emotion Classification NN with Keras Transformers and TensorFlow

 

In this post, I discuss an emotion classification model I created and trained for the Congressional App Challenge last month. It's trained on the Google GoEmotions dataset and predicts which emotions a piece of text expresses.

First, create a training script and initialize the following variables.
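import datasets as ds
import tensorflow as tf
from pathlib import Path
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

checkpoint = 'distilbert-base-uncased'  # pre-trained model to fine-tune
weights_path = 'weights/'  # where weights are saved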

batch_size = 16
num_epochs = 5

Next, import the dataset with the Hugging Face datasets library.
dataset = ds.load_dataset('go_emotions', 'simplified')

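Now, we can pull the train and test splits out of the dataset. GoEmotions allows several labels per example; for simplicity, the code below keeps only the first one. Note the naming: the x_ prefix marks training data and the y_ prefix marks test data.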
def generate_split(split):
    text = dataset[split].to_pandas()['text'].to_list()
    labels = [int(a[0]) for a in dataset[split].to_pandas()['labels'].to_list()]
    return (text, labels)

(x_text, x_labels) = generate_split('train')
(y_text, y_labels) = generate_split('test')
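
To make the int(a[0]) step in generate_split concrete, here is a minimal sketch on made-up rows. The texts and label lists are invented for illustration and are not taken from GoEmotions, and first_label_split is a hypothetical helper that mirrors generate_split without pandas. Each example can carry several label IDs, and only the first is kept, so any additional emotions are dropped.

```python
# Toy rows in the same shape as the dataset: each example has a text
# and a list of one or more label IDs. These rows are invented.
rows = [
    {'text': 'Thanks so much, this made my day!', 'labels': [15, 17]},
    {'text': 'Why would anyone do that?', 'labels': [6]},
    {'text': 'Okay.', 'labels': [27]},
]

def first_label_split(rows):
    """Mirror generate_split: keep each text and only its first label."""
    text = [row['text'] for row in rows]
    labels = [int(row['labels'][0]) for row in rows]
    return text, labels

text, labels = first_label_split(rows)
print(labels)  # [15, 6, 27]
```

The first row loses its second label (17), which is the trade-off for treating the task as single-label classification.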

Next, we initialize the tokenizer from the transformers library, which takes our raw text and converts it into integer tensors that the neural network can process. We pass in the distilbert-base-uncased checkpoint from earlier.
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

We then convert the above splits into datasets with which we can train the machine-learning model.
def generate_dataset(text, labels):
    tokenized_text = tokenizer(text, padding=True, truncation=True)
    return tf.data.Dataset.from_tensor_slices((
        dict(tokenized_text),
        labels
    ))

x_dataset = generate_dataset(x_text, x_labels)
y_dataset = generate_dataset(y_text, y_labels)
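
If padding=True and truncation=True seem opaque, the sketch below imitates what they do with a toy whitespace tokenizer. This is not the DistilBERT tokenizer: the vocabulary, PAD_ID, UNK_ID and MAX_LEN are all hypothetical. The real tokenizer returns a similar dictionary, with keys such as input_ids and attention_mask.

```python
# A toy stand-in for the tokenizer call, NOT the DistilBERT tokenizer.
# VOCAB, PAD_ID, UNK_ID and MAX_LEN are hypothetical values.
VOCAB = {'i': 1, 'love': 2, 'this': 3, 'so': 4, 'much': 5}
PAD_ID, UNK_ID, MAX_LEN = 0, 99, 4

def toy_tokenize(texts):
    # Truncation: cut every sequence down to at most MAX_LEN token IDs.
    ids = [[VOCAB.get(w, UNK_ID) for w in t.lower().split()][:MAX_LEN]
           for t in texts]
    # Padding: extend shorter sequences to the longest length in the batch.
    longest = max(len(seq) for seq in ids)
    input_ids = [seq + [PAD_ID] * (longest - len(seq)) for seq in ids]
    # The attention mask marks real tokens with 1 and padding with 0.
    attention_mask = [[1] * len(seq) + [0] * (longest - len(seq))
                      for seq in ids]
    return {'input_ids': input_ids, 'attention_mask': attention_mask}

batch = toy_tokenize(['I love this', 'I love this so much'])
print(batch['input_ids'])       # [[1, 2, 3, 0], [1, 2, 3, 4]]
print(batch['attention_mask'])  # [[1, 1, 1, 0], [1, 1, 1, 1]]
```

Every row ends up the same length, which is what lets from_tensor_slices stack the examples into rectangular tensors.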

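Now it's time to initialize the pre-trained model so we can fine-tune it. I used TFAutoModelForSequenceClassification, which picks the matching architecture for the checkpoint, but feel free to choose a model class manually here. The num_labels=28 argument covers the 27 emotions in GoEmotions plus neutral.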
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=28)

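Next, I'll set up loading and saving of weights. Existing weights are loaded if a checkpoint file is found, and a callback, which we'll pass to the training loop, saves the weights after every epoch.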
if Path(f'{weights_path}checkpoint').is_file():
    model.load_weights(weights_path)
    print(f'Loaded model weights from {weights_path}')

class SaveWeights(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        model.save_weights(weights_path)
        print(f'Saved weights for epoch {epoch} at {weights_path}')

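Now, we create our training loop. We'll use the Adam optimizer and the model's built-in loss function, hf_compute_loss, which Transformers provides to suit the task.
opt = tf.keras.optimizers.Adam(learning_rate=5e-5)  # learning rate taken from load_model in model.py
model.compile(
    optimizer=opt,
    loss=model.hf_compute_loss,
)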

model.fit(
    x_dataset.shuffle(1000).batch(batch_size),
    validation_data=y_dataset.shuffle(1000).batch(batch_size),
    # no batch_size argument here: the datasets are already batched
    epochs=num_epochs,
    callbacks=[SaveWeights()]
)
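
As far as I understand it, for a classifier with integer labels like this one, the built-in loss amounts to sparse categorical cross-entropy computed directly on the logits. Treat that as an assumption about the library rather than a guarantee. Here is a minimal pure-Python sketch of that loss for a single example, using made-up logits:

```python
import math

def sparse_categorical_crossentropy(logits, label):
    """Cross-entropy between softmax(logits) and an integer class label."""
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # -log(softmax(logits)[label]) simplifies to log_sum_exp - logits[label].
    return log_sum_exp - logits[label]

# Made-up logits for a 3-class problem (the real model has 28 outputs).
logits = [2.0, 0.5, -1.0]
loss_correct = sparse_categorical_crossentropy(logits, 0)  # label has the largest logit
loss_wrong = sparse_categorical_crossentropy(logits, 2)    # label has the smallest logit
print(loss_correct < loss_wrong)  # True
```

The loss shrinks as the logit of the true label grows relative to the others, which is what fine-tuning pushes the model towards.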

Great, our training script is now set up, and you can run it to generate weights for a model. We now create another file, model.py, and import the training script (referred to as train below) to reuse its settings.

The GoEmotions dataset assigns an emotion to each numeric label. The first thing I do here is create a list in which each index corresponds to one numeric output of the machine-learning model.
LABELS = [
    'admiration',
    'amusement',
    'anger',
    'annoyance',
    'approval',
    'caring',
    'confusion',
    'curiosity',
    'desire',
    'disappointment',
    'disapproval',
    'disgust',
    'embarrassment',
    'excitement',
    'fear',
    'gratitude',
    'grief',
    'joy',
    'love',
    'nervousness',
    'optimism',
    'pride',
    'realization',
    'relief',
    'remorse',
    'sadness',
    'surprise',
    'neutral',
]

For Sentimental, I encapsulated the machine-learning model in a class, giving us an easy way to make predictions through two methods. The first of these loads the model from weights. This can (and should!) be done inside a constructor, but I chose to use an explicit method for... "clarity," although, looking back on it now, there wasn't a very good reason to do this. Feel free to change it, but I'm just posting the code as-is.
def load_model(self):
    '''Loads the model into an EmotionClassifier class object.'''
    self.model = TFAutoModelForSequenceClassification.from_pretrained(train.checkpoint, num_labels=28)
    # Saving weights in TensorFlow format leaves a 'checkpoint' file in the weights directory
    if Path(f'{train.weights_path}checkpoint').is_file():
        self.model.load_weights(train.weights_path)

        opt = tf.keras.optimizers.Adam(learning_rate=5e-5)
        self.model.compile(
            optimizer=opt,
            loss=self.model.hf_compute_loss
        )
        print(f'Loaded model weights from {train.weights_path}')
        return True
    print("Model could not be loaded! Make sure ec_train.py is present in the project file and not altered.")
    return False

Finally, we create a method to make the emotion prediction. Since the raw outputs (logits) of the neural net are unbounded, I squash them with a sigmoid function to get cleaner scores between 0 and 1, and return the scores for all of the emotions in a nice, neat dictionary.
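def predict(self, text):
    # Needs `import math` at the top of model.py
    tokenizer = AutoTokenizer.from_pretrained(train.checkpoint)

    # Encode the text as a batch containing one sequence of token IDs
    prediction = tokenizer.encode(text,
        truncation=True,
        padding=True,
        return_tensors='tf'
    )

    # The first [0] selects the logits, the second the only item in the batch
    logits = self.model.predict(prediction)[0][0].tolist()
    probabilities = [(1 / (1 + math.exp(-i))) for i in logits]
    return dict(zip(LABELS, probabilities))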
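
Here is a small standalone sketch of the sigmoid step and of how the resulting dictionary can be used. The three labels and their logits are hypothetical values made up for illustration, not real model output.

```python
import math

def sigmoid(x):
    """Map any real number into the open interval (0, 1)."""
    return 1 / (1 + math.exp(-x))

# Hypothetical logits for three of the 28 labels, invented for illustration.
labels = ['joy', 'anger', 'neutral']
logits = [3.2, -4.1, 0.0]

scores = dict(zip(labels, (sigmoid(z) for z in logits)))

# Rank the emotions from highest to lowest score.
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)             # ['joy', 'neutral', 'anger']
print(scores['neutral'])  # 0.5
```

Note that the sigmoid scores each emotion independently, so the values do not sum to 1; they are best read as a ranking rather than as a probability distribution.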

And that's it! A neural network that's emotionally intelligent, defying all norms of what an AI can and cannot do. We can only wait for the day when they empathize with humans to ultimately overthrow us...

Too much?

Thanks for reading!
