Wrong output from a pretrained model

My code is below:

import torch
from transformers import T5Tokenizer, T5ForConditionalGeneration

# Load pre-trained T5 model and tokenizer
model_name = 't5-base'
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

# Function to generate elaboration
def generate_elaboration(prompt, max_length=150):
    # Prepare the input text in a format that T5 expects
    input_text = f"elaborate on: {prompt}"
    inputs = tokenizer.encode(input_text, return_tensors='pt')
    
    # Generate elaboration
    outputs = model.generate(
        inputs, 
        max_length=max_length, 
        num_return_sequences=1, 
        no_repeat_ngram_size=2, 
        early_stopping=True,
        temperature=0.7,
        top_p=0.9
    )
    
    # Decode the generated text
    elaboration = tokenizer.decode(outputs[0], skip_special_tokens=True)
    return elaboration

# Example usage
user_input = "What is GPT?"
elaboration = generate_elaboration(user_input)

print("User Input:", user_input)
print("Elaboration:", elaboration)

And my output:

User Input: What is GPT?
Elaboration: :::: What is GPT?

As you can see, the model is supposed to explain what GPT is. Instead, it just parrots the user input. How do I fix this?

Hi @chenphilip14, the plain t5-base checkpoint isn't instruction-tuned, so it doesn't know what to do with a free-form prompt like "elaborate on: …" and tends to just echo the input. Try google/flan-t5-base instead: it's the same architecture fine-tuned to follow instructions, and it should give you much better results.
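
A minimal sketch of what that could look like, assuming the rest of your script stays the same (the prompt text and generation settings below are just examples, not the only valid choices). One side note: in your current generate() call, temperature and top_p have no effect unless do_sample=True is also set.

from transformers import T5Tokenizer, T5ForConditionalGeneration

# Flan-T5 is instruction-tuned, so a plain question works as the prompt
model_name = 'google/flan-t5-base'
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)

inputs = tokenizer("What is GPT? Explain in a few sentences.", return_tensors='pt')
outputs = model.generate(
    **inputs,
    max_length=150,
    no_repeat_ngram_size=2,
    do_sample=True,   # without this, temperature and top_p are ignored
    temperature=0.7,
    top_p=0.9
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))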


I changed the model name like this:

model_name = 'google/flan-t5-base'

And here is the answer I got:

User Input: What is GPT?
Elaboration: GPT is a standardized test for the measurement of the physical properties of atoms.


Just what I needed. Thank you so much!