Generate Multiple Choice Questions with DeepSeek-V3 LLM

In this tutorial, I will generate multiple choice questions using an open-source model called DeepSeek-V3. At the time of writing, DeepSeek-V3 is one of the best-performing large language models available.

The user will provide a text snippet as input, and the model will create a few questions based on the given text. For each question, the model will also generate four choices and identify the correct option.

I will use OpenRouter’s API to access the model; however, you can also run DeepSeek-V3 locally on your system. To generate the MCQs in the desired structured format, I will use a Python library called Instructor.

How to access DeepSeek-V3 via OpenRouter?

OpenRouter is a platform that hosts several open-source LLMs and VLMs, including DeepSeek-V3.

After creating your account on OpenRouter, you need to purchase some credits to access models like DeepSeek-V3. There are a few models on the platform that can be accessed for free as well.

Before accessing any model, you must create an OpenRouter API secret key for your account. To create the key, first sign in to openrouter.ai, then go to the Keys section.


Click on the ‘Create Key’ button to generate the API key. Save the generated key securely.
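
Hardcoding the key in a script makes it easy to leak. As a minimal sketch (the variable name OPENROUTER_API_KEY is my own convention, not something OpenRouter requires), you can export the key in your shell and read it in Python:

import os

# assumes the key was exported beforehand, e.g.
#   export OPENROUTER_API_KEY="sk-or-..."
api_key = os.environ["OPENROUTER_API_KEY"]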

Implement MCQ generation

Time to implement MCQ generation with an LLM in Python. Make sure you have the Instructor, OpenAI, and Pydantic libraries installed on your system.
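
If any of them are missing, you can install them with pip (pinning Instructor to 1.7.2, the version used in this tutorial, is optional):

pip install instructor==1.7.2 openai pydantic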

Import Python libraries

import instructor  # version - 1.7.2; patches the OpenAI client for structured outputs
from openai import OpenAI  # OpenAI-compatible client, pointed at OpenRouter below
from pydantic import BaseModel  # used to define the MCQ output schema

Initialize LLM client

# patch the OpenAI client with Instructor; OpenRouter exposes an OpenAI-compatible API
client = instructor.from_openai(
    OpenAI(
        api_key="YOUR-OpenRouter-API-Key",  # or read the key from an environment variable, as shown above
        base_url="https://openrouter.ai/api/v1",
    )
)

Define your desired output structure

The desired MCQ output should contain the main question, four choices, and the correct answer. We can use Pydantic’s BaseModel to create an MCQ class that defines the structure of the output we want the model to generate.

class MCQ(BaseModel):
    question: str
    choice1: str
    choice2: str
    choice3: str
    choice4: str
    correct_answer: str
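
Optionally, the schema can check itself. The following sketch goes beyond the original tutorial: a Pydantic v2 model_validator that rejects any MCQ whose correct_answer does not exactly match one of the four choices. If you pass max_retries to Instructor, it can use such validation errors to re-ask the model.

from pydantic import model_validator

class ValidatedMCQ(MCQ):
    # reject outputs where the stated answer is not one of the listed choices
    @model_validator(mode="after")
    def answer_must_be_a_choice(self):
        choices = {self.choice1, self.choice2, self.choice3, self.choice4}
        if self.correct_answer not in choices:
            raise ValueError("correct_answer must exactly match one of the four choices")
        return self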

Generate questions

I will take a text passage from Wikipedia and include it in the prompt, then send the prompt to the DeepSeek-V3 model through the OpenRouter API.

text = """
In the basic version of this experiment, a coherent light source,
such as a laser beam, illuminates a plate pierced by two parallel slits,
and the light passing through the slits is observed on a screen behind the
plate. The wave nature of light causes the light waves passing through the
two slits to interfere, producing bright and dark bands on the screen –
a result that would not be expected if light consisted of classical
particles. However, the light is always found to be absorbed at the screen
at discrete points, as individual particles (not waves); the interference
pattern appears via the varying density of these particle hits on the screen.
Furthermore, versions of the experiment that include detectors at the slits
find that each detected photon passes through one slit
(as would a classical particle), and not through both slits (as would a wave).
However, such experiments demonstrate that particles do not form the
interference pattern if one detects which slit they pass through.
These results demonstrate the principle of wave–particle duality.
"""

mcqs = client.chat.completions.create_iterable(
    model="deepseek/deepseek-chat",
    messages=[
        {"role": "user", "content": f"""
            Extract multiple choice questions from the following text content:

            {text}
        """},
    ],
    response_model=MCQ,
)

Since ‘mcqs’ is an iterable, it yields one question with its four choices on every iteration. I will store the generated MCQs in a list.

questions = []

# generate multiple choice questions
for q in mcqs:
    questions.append(q.model_dump())

You can check the number of questions generated by the LLM by running the following code.

len(questions)

Output: 5
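
As an aside, if you prefer to receive all questions in a single response rather than iterating, recent Instructor versions also accept a list response model. Here is a minimal sketch (same model and prompt as above; ‘questions_alt’ is my own name):

from typing import List

all_mcqs = client.chat.completions.create(
    model="deepseek/deepseek-chat",
    messages=[{"role": "user", "content": f"Extract multiple choice questions from the following text content:\n\n{text}"}],
    response_model=List[MCQ],
)
questions_alt = [q.model_dump() for q in all_mcqs]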

Display generated MCQs

So, from the given text passage, the model generated five questions. If we pass longer text content, it may create more questions.

Let me print a few generated questions.

questions[0]

Output:

{'question': 'What is observed on the screen behind the plate in the basic version of the experiment?',
 'choice1': 'A single bright spot',
 'choice2': 'A single dark spot',
 'choice3': 'Bright and dark bands',
 'choice4': 'No pattern',
 'correct_answer': 'Bright and dark bands'}

questions[1]

Output:

{'question': 'What causes the interference pattern on the screen?',
 'choice1': 'The particle nature of light',
 'choice2': 'The wave nature of light',
 'choice3': 'The classical particle behavior',
 'choice4': 'The absence of light',
 'correct_answer': 'The wave nature of light'}

questions[2]

Output:

{'question': 'How is the light absorbed at the screen?',
 'choice1': 'As continuous waves',
 'choice2': 'As individual particles',
 'choice3': 'As a single particle',
 'choice4': 'As a single wave',
 'correct_answer': 'As individual particles'}

As you can see, the generated MCQs follow a consistent format, so we can feed these outputs into any downstream task without extra parsing or cleanup.
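
For instance, if you want to persist the questions for later use, here is a short sketch (the file name mcqs.json is arbitrary):

import json

# write the list of question dicts to disk as JSON
with open("mcqs.json", "w", encoding="utf-8") as f:
    json.dump(questions, f, indent=2, ensure_ascii=False)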
