
Emotion Classification NN with Keras Transformers and TensorFlow

 

In this post, I discuss an emotion classification model I created and trained for the Congressional App Challenge last month. It's trained on the Google GoEmotions dataset and can detect the emotional qualities of a text.

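First, create a training script, import the libraries it needs, and initialize the following variables.
from pathlib import Path
import datasets as ds
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification

checkpoint = 'distilbert-base-uncased' # model to fine-tune
weights_path = 'weights/' # where weights are saved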

batch_size = 16
num_epochs = 5

Next, import the dataset with the Hugging Face datasets library.
dataset = ds.load_dataset('go_emotions', 'simplified')

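Now, we can create train and test splits for our data. Note that in the variable names below, the x_ prefix marks the training split and the y_ prefix marks the test split, rather than inputs and targets.
def generate_split(split):
    df = dataset[split].to_pandas()  # convert the split once and reuse it
    text = df['text'].to_list()
    # each entry in 'labels' is a list of ids; keep only the first one
    labels = [int(a[0]) for a in df['labels'].to_list()]
    return (text, labels)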

(x_text, x_labels) = generate_split('train')
(y_text, y_labels) = generate_split('test')
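GoEmotions is a multi-label dataset, so each entry in the labels column is a list of label ids, and generate_split keeps only the first one. Here's a minimal sketch of that reduction on made-up rows, which runs without downloading anything. The rows data and the first_label helper are hypothetical and not part of the original script.

```python
# Toy stand-in for dataset[split]: each row has text and a list of label ids.
# These rows are invented for illustration only.
rows = [
    {"text": "Thanks so much!", "labels": [15]},
    {"text": "Wow, that is hilarious and sweet", "labels": [1, 18]},
    {"text": "Okay.", "labels": [27]},
]


def first_label(label_ids):
    """Keep only the first label id, mirroring int(a[0]) in generate_split."""
    return int(label_ids[0])


texts = [row["text"] for row in rows]
labels = [first_label(row["labels"]) for row in rows]

print(labels)  # [15, 1, 27]
```

The second row loses its extra label (18), which is the trade-off of treating a multi-label problem as a single-label one.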

Next, we initialize the tokenizer from the transformers library, which takes our raw text and converts it into integer tensors that the neural network can process. We pass in the distilbert-base-uncased checkpoint from earlier.
tokenizer = AutoTokenizer.from_pretrained(checkpoint)

We then convert the above splits into datasets with which we can train the machine-learning model.
def generate_dataset(text, labels):
    tokenized_text = tokenizer(text, padding=True, truncation=True)
    return tf.data.Dataset.from_tensor_slices((
        dict(tokenized_text),
        labels
    ))

x_dataset = generate_dataset(x_text, x_labels)
y_dataset = generate_dataset(y_text, y_labels)
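The padding=True and truncation=True arguments used above pad every sequence to the longest one in the batch and cut off anything past the maximum length. Conceptually, that looks something like the sketch below. It is a simplified illustration and not the tokenizer's actual code: the pad_and_truncate helper, the PAD_ID value, and the token ids are all made up, and a real tokenizer also keeps the closing special token when it truncates.

```python
PAD_ID = 0  # hypothetical padding id; real tokenizers define their own


def pad_and_truncate(sequences, max_length):
    """Truncate each id list to max_length, then pad to the longest one."""
    truncated = [seq[:max_length] for seq in sequences]
    longest = max(len(seq) for seq in truncated)
    input_ids = [seq + [PAD_ID] * (longest - len(seq)) for seq in truncated]
    # 1 marks a real token and 0 marks padding, like an attention mask
    attention_mask = [
        [1] * len(seq) + [0] * (longest - len(seq)) for seq in truncated
    ]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


batch = pad_and_truncate(
    [[101, 7, 8, 102], [101, 9, 102], [101, 1, 2, 3, 4, 5, 102]],
    max_length=5,
)
for ids, mask in zip(batch["input_ids"], batch["attention_mask"]):
    print(ids, mask)
```

Every row ends up with the same length, which is what lets tf.data.Dataset.from_tensor_slices stack them into one tensor.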

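Now it's time to initialize the pre-trained model so we can fine-tune it. I used TFAutoModelForSequenceClassification, which picks the matching architecture for the checkpoint, but feel free to manually choose a model class here. The num_labels=28 argument matches the 28 GoEmotions categories (27 emotions plus neutral).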
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint, num_labels=28)

Next, I'll set up loading and saving. Existing weights are loaded if they are present, and a callback, which we'll pass to the training loop, saves the weights at the end of every epoch.
if Path(f'{weights_path}checkpoint').is_file():
    model.load_weights(weights_path)
    print(f'Loaded model weights from {weights_path}')

class SaveWeights(tf.keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        model.save_weights(weights_path)
        print(f'Saved weights for epoch {epoch} at {weights_path}')

Now, we create our training loop. We'll use the Adam optimizer and the loss function that comes built into the Hugging Face model.
opt = tf.keras.optimizers.Adam(learning_rate=5e-5)
model.compile(
    optimizer=opt,
    loss=model.hf_compute_loss,
)

# The datasets are already batched, so fit() does not need a batch_size argument
model.fit(
    x_dataset.shuffle(1000).batch(batch_size),
    validation_data=y_dataset.shuffle(1000).batch(batch_size),
    epochs=num_epochs,
    callbacks=[SaveWeights()]
)
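As far as I can tell, the model's built-in loss for single-label classification is sparse categorical cross-entropy computed from the raw logits. Treat that as an assumption and check it against your transformers version. Here's a rough pure-Python sketch of that calculation for a single example; the function name and the sample logits are my own.

```python
import math


def sparse_categorical_crossentropy(logits, label):
    """Cross-entropy between softmax(logits) and an integer class label."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_sum_exp - logits[label]


# An untrained model that scores all 28 classes equally
uniform_logits = [0.0] * 28
loss_uniform = sparse_categorical_crossentropy(uniform_logits, 3)

# A model that is confident about the correct class
confident_logits = [0.0] * 28
confident_logits[3] = 10.0
loss_confident = sparse_categorical_crossentropy(confident_logits, 3)

print(round(loss_uniform, 3), loss_confident < loss_uniform)
```

With 28 equally likely classes the loss is ln(28), roughly 3.33, so a training loss near that value at the start of the first epoch is expected.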

Great, our training script is now set up, and you can run it to generate weights for a model. We now create another file, model.py, and import the training script (referred to as train in the code below) so we can reuse its settings.
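One thing to watch for: importing a Python module runs its top-level code, so importing the training script as written would start training all over again. A common way around this is a main guard. A minimal sketch, where run_training is a hypothetical stand-in for the compile and fit calls:

```python
def run_training():
    """Stand-in for the model.compile(...) and model.fit(...) calls."""
    return "training finished"


# This block runs only when the file is executed directly
# (for example, python ec_train.py), not when another module imports it.
if __name__ == "__main__":
    print(run_training())
```

With this in place, model.py can import the script for its variables without kicking off a new training run.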

The GoEmotions dataset assigns an emotion to each numeric label. The first thing I do here is create a list where each index corresponds to a numerical output of the machine-learning model.
LABELS = [
    'admiration',
    'amusement',
    'anger',
    'annoyance',
    'approval',
    'caring',
    'confusion',
    'curiosity',
    'desire',
    'disappointment',
    'disapproval',
    'disgust',
    'embarrassment',
    'excitement',
    'fear',
    'gratitude',
    'grief',
    'joy',
    'love',
    'nervousness',
    'optimism',
    'pride',
    'realization',
    'relief',
    'remorse',
    'sadness',
    'surprise',
    'neutral',
]

For Sentimental, I encapsulated the machine-learning model in a class, giving us an easy way to make predictions using two methods. The first of these loads the model from the saved weights. This can (and should!) be done inside a constructor, but I chose to use an explicit method for... "clarity," although, looking back on it now, there wasn't a very good reason to do this. Feel free to change it, but I'm just posting the code as-is.
def load_model(self):
    '''Loads the model into an EmotionClassifier class object.'''
    self.model = TFAutoModelForSequenceClassification.from_pretrained(train.checkpoint, num_labels=28)
    # save_weights leaves a 'checkpoint' file in the weights directory
    if Path(f'{train.weights_path}checkpoint').is_file():
        self.model.load_weights(train.weights_path)

        opt = tf.keras.optimizers.Adam(learning_rate=5e-5)
        self.model.compile(
            optimizer=opt,
            loss=self.model.hf_compute_loss
        )
        print(f'Loaded model weights from {train.weights_path}')
        return True
    print("Model could not be loaded! Make sure ec_train.py is present in the project file and not altered.")
    return False
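If you would rather do the loading in a constructor, as suggested above, the class could look something like this. This is only a sketch: the loader argument is a hypothetical callable I'm introducing so the example runs without TensorFlow. In the real class, it would wrap the from_pretrained and load_weights calls.

```python
class EmotionClassifier:
    """Sketch of a classifier that loads its model when it is constructed."""

    def __init__(self, loader):
        # loader is a hypothetical zero-argument callable that returns a
        # ready model, or None when no saved weights can be found.
        self.model = loader()
        if self.model is None:
            raise RuntimeError("Model could not be loaded")


# Stand-in loaders for demonstration only
classifier = EmotionClassifier(lambda: "fake model")
print(classifier.model)

try:
    EmotionClassifier(lambda: None)
except RuntimeError as err:
    print(err)
```

Raising an exception instead of returning False means a caller can't accidentally keep using an object with no model behind it.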

Finally, we create a method to make the emotion prediction. Since the raw outputs (logits) of the neural net can span a massive range, I pass each one through a sigmoid function to squish it between 0 and 1, then return the scores for all of the emotions in a nice, neat dictionary.
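def predict(self, text):
    # Requires "import math" at the top of model.py.
    # Reloading the tokenizer on every call is slow; consider caching it on self.
    tokenizer = AutoTokenizer.from_pretrained(train.checkpoint)

    encoded = tokenizer.encode(text,
        truncation=True,
        padding=True,
        return_tensors='tf'
    )

    # The first [0] selects the logits, the second [0] selects our single input text
    logits = self.model.predict(encoded)[0][0].tolist()
    probabilities = [(1 / (1 + math.exp(-i))) for i in logits]
    return dict(zip(LABELS, probabilities))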
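To see what the sigmoid step does to the numbers, here's a small standalone sketch with made-up logits for three of the labels. The logits values and the softmax comparison are my own additions for illustration. Sigmoid scores each emotion independently, so the scores don't have to add up to 1, whereas softmax would turn the same logits into a single probability distribution.

```python
import math


def sigmoid(x):
    """Squash a single logit into the range (0, 1)."""
    return 1 / (1 + math.exp(-x))


def softmax(logits):
    """Turn a list of logits into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["admiration", "amusement", "anger"]  # subset of the full list
logits = [2.0, -1.0, 0.5]  # invented values, not real model output

scores = dict(zip(labels, (sigmoid(x) for x in logits)))
top_two = sorted(scores.items(), key=lambda item: item[1], reverse=True)[:2]

print([name for name, _ in top_two])  # ['admiration', 'anger']
print(sum(scores.values()) > 1)       # True: sigmoid scores are independent
```

Sorting the dictionary like this is also a handy way to show only the strongest few emotions instead of all 28.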

And that's it! A neural network that's emotionally intelligent, defying all norms of what an AI can and cannot do. We can only wait for the day when they empathize with humans to ultimately overthrow us...

Too much?

Thanks for reading!
