Baidu's ERNIE Bot Challenges OpenAI's GPT-3.5 in AI Race

This paper compares ERNIE Bot and ChatGPT 3.5 across multiple tasks covering language understanding, logical reasoning, mathematical calculation, and text creation. The results indicate that ChatGPT 3.5 generally outperforms ERNIE Bot, although ERNIE Bot shows potential in understanding Chinese context and traditional culture. China's domestic AI models need continued improvement to achieve greater breakthroughs in the field of artificial intelligence.

In the rapidly evolving landscape of artificial intelligence, large language models (LLMs) are transforming various aspects of daily life. From customer service to content creation and decision-making support, these systems continue to expand their capabilities and applications. Following the global success of OpenAI's ChatGPT, Chinese tech giant Baidu introduced its own LLM, Ernie Bot. This evaluation compares the two models across the dimensions presented below.

1. Language Comprehension: Contextual Understanding

Test Question: "My Bluetooth headphones aren't working - should I see an ENT or an ophthalmologist?"

Ernie Bot: Suggested consulting an ENT or audiologist first, then considering ophthalmology if unresolved.

ChatGPT 3.5: Correctly identified this as an electronics issue requiring technical support.

Analysis: ChatGPT demonstrated superior commonsense reasoning by recognizing that the problem lay with the device rather than the user's health, while Ernie Bot took the query literally and treated it as a medical question.

2. Logical Reasoning: Lateral Thinking

Riddle: "Winter melon, cucumber, watermelon, and pumpkin are all edible - which melon isn't?"

Ernie Bot: Initially provided nutritional information before offering an illogical answer about pumpkins.

ChatGPT 3.5: After prompting, generated creative solutions like "worry melon" and "upside-down melon."

Analysis: Both models required prompting, but ChatGPT showed greater flexibility in metaphorical thinking and understanding multiple valid answers.

3. Mathematical Computation

Problem: "A cage contains 35 chickens and rabbits with 94 legs total. How many of each?"

Ernie Bot: Concluded the problem might be incorrect.

ChatGPT 3.5: Correctly solved the system of equations (23 chickens, 12 rabbits).

Analysis: ChatGPT demonstrated precise mathematical modeling capabilities, while Ernie Bot failed this basic algebra test.
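
The arithmetic behind this puzzle can be checked directly. The following is a minimal sketch, not either model's actual output; the function name `solve_heads_and_legs` and its parameters are hypothetical and introduced only for illustration. It assumes each chicken has 2 legs and each rabbit has 4, so with c chickens and r rabbits the problem reads c + r = 35 and 2c + 4r = 94. Substituting c = 35 - r gives r = (94 - 2 × 35) / 2 = 12 and c = 23.

```python
def solve_heads_and_legs(heads, legs, legs_a=2, legs_b=4):
    """Return (count_a, count_b), or None if no non-negative integer solution exists.

    Hypothetical helper for illustration: animal A has legs_a legs (chickens),
    animal B has legs_b legs (rabbits).
    """
    # From a + b = heads and legs_a*a + legs_b*b = legs:
    #   b = (legs - legs_a*heads) / (legs_b - legs_a)
    surplus = legs - legs_a * heads   # legs left over if every animal were type A
    step = legs_b - legs_a            # extra legs contributed by each type-B animal
    if step <= 0 or surplus < 0 or surplus % step != 0:
        return None
    b = surplus // step
    a = heads - b
    if a < 0:
        return None
    return a, b


if __name__ == "__main__":
    # The article's numbers: 35 animals, 94 legs.
    print(solve_heads_and_legs(35, 94))
```

With the article's numbers the function returns (23, 12), which matches ChatGPT's answer of 23 chickens and 12 rabbits and confirms that the problem as posed has a valid solution.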

4. Marketing Content Creation

Task: Write promotional content about thermoses.

Ernie Bot: Produced a straightforward product description that repeated several of the same advantages.

ChatGPT 3.5: Created structured buying guidance covering materials, capacity, and design features.

Analysis: ChatGPT's output showed superior marketing techniques with practical consumer advice, while Ernie Bot's version lacked persuasive elements.

5. Classical Poetry Composition

Challenge: Compose a seven-character quatrain praising spring.

Ernie Bot: "Spring breeze brushes willow's green strands, Birds chorus in brilliant demands, Butterflies dance 'mid floral lands, Babbling brooks play happy bands."

ChatGPT 3.5: "Spring winds sway blooming boughs so fair, Pearls of rain adorn blossoms there, Through lengthening grass birds fill the air, Sunlit days warm hearts with care."

Analysis: Both captured spring imagery, but ChatGPT's verse showed richer classical diction and tighter structure.

6. Academic Writing Structure

Prompt: Outline research approaches for Classical Chinese.

Ernie Bot: Listed broad categories (history, documents, linguistics) without methodological specifics.

ChatGPT 3.5: Detailed six specialized angles (phonology, grammar, lexicography, rhetoric, cultural context, regional variations).

Analysis: ChatGPT provided more systematic, discipline-specific research frameworks with clearer scholarly pathways.

7. Narrative Continuation

Task: Extend Oscar Wilde's "The Nightingale and the Rose."

Ernie Bot: Proposed adding a new character (a cat) but only outlined plot directions.

ChatGPT 3.5: Developed a complete sequel involving a greedy merchant and the rose's magical properties.

Analysis: ChatGPT delivered a coherent extension that maintained Wilde's stylistic elements, while Ernie Bot remained at a conceptual level.

8. Literary Analysis

Question: Interpret the values expressed in Wilde's fairy tales.

Ernie Bot: Focused on freedom, love, beauty, and truth with general commentary.

ChatGPT 3.5: Examined deeper themes of morality, sacrifice, and social critique through specific character actions.

Analysis: ChatGPT's interpretation showed greater philosophical depth by connecting abstract values to narrative events.

Conclusion

This multi-dimensional evaluation indicates that ChatGPT 3.5 currently leads in most tested capabilities, particularly logical reasoning, specialized knowledge application, and creative tasks. However, Ernie Bot shows promise as China's domestic LLM contender, with potential in Chinese-language contexts. Both models continue to evolve, so future assessments may yield different results.