For some reason, I have been a fascinated follower of U.S. political debates for decades. It’s always intriguing to watch how Democrats and Republicans clash during presidential debates. While these debates often feature a lot of optics, the policy discussions are also crucial. After each debate, I visit multiple media sites to see how they spin the events, and watching that post-debate coverage adds another layer of fascination.

On September 10th, we witnessed one of the most-watched debates, with over 67 million viewers, between Vice President Kamala Harris and former President Donald Trump. The post-debate coverage from right-wing and left-wing media organizations presented very different portrayals of the debate, clearly reflecting significant political bias.

I wondered if we could use GenAI to address this issue, and the idea of creating a debate analyzer came to mind. It was a quick and straightforward thought, and building it turned out to be quite simple. Here is what I did.

I used AWS PartyRock to develop an app that allows users to upload debate transcripts. The app then leverages LLMs to analyze the transcripts and provide insights.

Requirements in Short:

The app allows users to upload a debate transcript and automatically identifies the two opponents and the key themes of the debate. It then analyzes each theme to determine who performed better on specific issues. Additionally, the app assesses overall debate qualities such as presentation, delivery, and ability to command attention. Based on these factors, the app provides a concise summary of who won the debate, both on individual themes and on overall performance.

These requirements in turn led to four prompts:

Prompt 1 : Debate Summary : Beyond the specific themes, assess the overall performance and debating skills of each opponent in [Debate Transcript]. Consider factors like delivery, ability to command attention, rebuttals, and persuasiveness. Determine which opponent had the stronger overall debate performance.

Prompt 2 : Theme Analysis : For each of the key themes identified in [Debate Summary], analyze which opponent had a stronger position and argument based on the content in [Debate Transcript]. Provide a brief summary for each theme and state who won it.

Prompt 3 : Debate Performance : Beyond the specific themes, assess the overall performance and debating skills of each opponent in [Debate Transcript]. Consider factors like delivery, ability to command attention, rebuttals, and persuasiveness. Determine which opponent had the stronger overall debate performance.

Prompt 4 : Debate Chat : Based on the analysis in [Theme Analysis] and [Debate Performance], as well as the original [Debate Transcript], who do you think won the overall debate and why?
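
The bracketed names in these prompts point at the outputs of other widgets, so the four prompts form a small chain. PartyRock handles this wiring itself; the sketch below only illustrates the idea in Python. The helper names (referenced_widgets, fill_prompt) and the placeholder outputs are my own assumptions for illustration, not part of PartyRock.

```python
import re

# Matches a widget reference such as [Debate Transcript].
WIDGET_REF = re.compile(r"\[([^\[\]]+)\]")


def referenced_widgets(template):
    """Return the widget names a prompt template refers to, in order."""
    return WIDGET_REF.findall(template)


def fill_prompt(template, outputs):
    """Replace each [Widget Name] with that widget's text output."""
    def substitute(match):
        name = match.group(1)
        if name not in outputs:
            raise KeyError(f"No output available for widget: {name}")
        return outputs[name]

    return WIDGET_REF.sub(substitute, template)


# Prompt 4 from the list above.
debate_chat = (
    "Based on the analysis in [Theme Analysis] and [Debate Performance], "
    "as well as the original [Debate Transcript], who do you think won "
    "the overall debate and why?"
)

# Placeholder outputs, invented purely for illustration.
outputs = {
    "Debate Transcript": "<transcript text>",
    "Theme Analysis": "<theme analysis text>",
    "Debate Performance": "<performance text>",
}

print(referenced_widgets(debate_chat))
print(fill_prompt(debate_chat, outputs))
```

Listing the references first makes the order of execution obvious: a prompt can only run once every widget it names has produced output.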

The LLMs available for the above tasks are:

Claude 3 Haiku
Claude 3 Sonnet
Titan Text Lite
Titan Text Express
Jurassic-2 Mid
Jurassic-2 Ultra

Based on the prompts provided, I decided to go with Titan Text Express for this task.

Titan Text Express is a general-purpose large language model in Amazon's Titan family, trained on a large amount of text data. In my view, this makes it a reasonable fit for tasks that require in-depth analysis and the ability to synthesize information from a long transcript, as is the case with the prompts given.

Additionally, Titan Text Express has been optimized for efficiency and speed, which would be beneficial for generating concise and coherent responses within the constraints of the prompts.

For the prompts provided, I went with the following temperature and top-p settings for Titan Text Express:

Prompt 1 (Debate Summary): Temperature: 0.7 Top-p: 0.9
Prompt 2 (Theme Analysis): Temperature: 0.7 Top-p: 0.9
Prompt 3 (Debate Performance): Temperature: 0.7 Top-p: 0.9
Prompt 4 (Debate Chat): Temperature: 0.7 Top-p: 0.9

My reasons for choosing those values are:

Temperature: A temperature of 0.7 strikes a good balance between generating diverse and creative responses, while still maintaining coherence and relevance to the prompts. This setting allows the model to explore a range of possible outputs, without straying too far from the core task.

Top-p: A top-p of 0.9 restricts sampling to the smallest set of most likely tokens whose combined probability reaches 0.9. This cuts off the unlikely tail while still allowing a degree of diversity in the output, which helps to produce responses that are focused and on-topic without being overly repetitive or predictable.

These settings are a reasonable starting point for tasks that require in-depth analysis, critical thinking, and the ability to synthesize information from multiple sources, as is the case with the prompts provided.
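
To make the two settings concrete, here is a minimal sketch of what temperature and top-p do to a next-token distribution. It is a toy illustration with made-up numbers, not how Titan Text Express is implemented internally; the function names and the example values are assumptions.

```python
import math


def apply_temperature(logits, temperature):
    """Turn raw scores into probabilities; lower temperature sharpens them."""
    scaled = [x / temperature for x in logits]
    highest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - highest) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the most likely tokens until their mass reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept = []
    cumulative = 0.0
    for index in order:
        kept.append(index)
        cumulative += probs[index]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    return {i: probs[i] / total for i in kept}


# Made-up scores for three candidate tokens.
logits = [2.0, 1.0, 0.0]
print(apply_temperature(logits, 1.0))
print(apply_temperature(logits, 0.7))  # the top token gets a larger share

# Made-up probabilities for four candidate tokens.
print(top_p_filter([0.5, 0.3, 0.15, 0.05], 0.9))  # the least likely token is dropped
```

With a temperature below 1.0 the model leans harder on its top choices, and top-p then removes the long tail before a token is sampled.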

That's it. I have developed a brand new Debate Analyzer Pro application. I’ll leave it to you to explore and, if you’d like, to find out who won the recent debate as well.

Can LLMs Truly Overcome Bias?

This is an intriguing question. While LLMs are designed to provide objective insights, it's important to recognize that they are not entirely free from bias. This is because LLMs are trained on extensive datasets drawn from a variety of sources. These datasets inevitably contain biases present in the original media and literature.

As a result, even though LLMs strive for impartiality, they can still reflect the biases embedded in their training data. To address this, it’s essential to be aware of and understand these biases. Recognizing the limitations and potential biases in the training datasets helps users interpret LLM analyses with a more nuanced perspective.

In essence, while tools like Titan Text Express are powerful for analysis, they are not immune to bias. Being mindful of this allows users to apply these insights more effectively and critically, ensuring a more balanced and objective evaluation.

Here are several steps you can take to reduce bias in LLM-based analysis:

Ensure that prompts are clear and specific, and use neutral language to minimize bias.
Cross-check results using multiple LLMs to better understand and address potential biases.
Utilize human reviews to validate results, and consider crowdsourcing feedback for additional insights.
Conduct regular bias audits and implement strategies to identify and address bias early.
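
As one way to picture the cross-checking step, the sketch below compares the verdicts of several models and flags any disagreement for human review. The model labels, the verdict strings, and the cross_check helper are all hypothetical; in practice you would paste in the answers each model gave for the same transcript.

```python
from collections import Counter


def cross_check(verdicts, min_agreement=1.0):
    """Summarize verdicts from several models and flag low agreement.

    verdicts maps a model label to the winner that model named.
    """
    if not verdicts:
        raise ValueError("At least one verdict is required")
    counts = Counter(verdicts.values())
    # On a tie, most_common returns the verdict that was seen first.
    majority, votes = counts.most_common(1)[0]
    agreement = votes / len(verdicts)
    return {
        "majority": majority,
        "agreement": agreement,
        "needs_human_review": agreement < min_agreement,
    }


# Hypothetical verdicts, invented for illustration only.
verdicts = {
    "model_a": "Speaker A",
    "model_b": "Speaker A",
    "model_c": "Speaker B",
}

print(cross_check(verdicts))
```

A split verdict does not tell you which model is biased, but it is a useful signal that the result deserves a human look before you trust it.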

If you’d like to see the app I developed, here is the link: https://partyrock.aws/u/indika/AmAeA7Nmb/Debate-Analyzer-Pro

AWS PartyRock powers Debate Analyzer Pro: Breaking Traditional Media Bias