Şevval Alper

AI Researcher
19 Articles

Şevval is an AI researcher at AIMultiple. She has previous research experience in pseudorandom number generation using chaotic systems.

Research interests

Şevval focuses on AI coding tools, AI agents, and quantum technologies.

She is part of the AIMultiple benchmark team, conducting assessments and providing insights to help readers understand various emerging technologies and their applications.

Professional experience

She contributed to organizing and guiding participants in three “CERN International Masterclasses - hands-on particle physics” events in Türkiye, working alongside faculty to facilitate learning.

Education

Şevval holds a Bachelor's degree in Physics from Middle East Technical University.

Latest Articles from Şevval

AI · Jan 13

Text-to-Video Generator Benchmark in 2026

A text-to-video generator is an AI system that turns written prompts into short videos by generating visuals, motion, and sometimes audio directly from natural language.

Agentic AI · Dec 29

Code Execution with MCP: A New Approach to AI Agent Efficiency

Anthropic introduced a method in which AI agents interact with Model Context Protocol (MCP) servers by writing executable code rather than making direct calls to tools. The agent treats tools as files on a computer, finds what it needs, and uses them directly with code, so intermediate data doesn’t have to pass through the model’s memory.
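
The idea can be sketched in a few lines of agent-generated code. The layout below (tool wrappers exposed as importable files under a tools/ directory) and all module and function names are illustrative assumptions, not Anthropic's actual implementation.

```python
# Hypothetical sketch: instead of issuing one MCP tool call per step, the
# agent writes code like this. Tool wrappers are assumed to live on disk
# (e.g. ./tools/<server>/<tool>.py) so the agent can discover and import
# only the ones it needs.
from tools.crm import export_contacts      # assumed wrapper around an MCP tool
from tools.sheets import append_rows       # assumed wrapper around an MCP tool

def sync_contacts_to_sheet(sheet_id: str) -> str:
    # Large intermediate data stays inside the execution environment;
    # it never has to be echoed back through the model's context window.
    contacts = export_contacts(updated_since="2025-01-01")
    rows = [[c["name"], c["email"]] for c in contacts]
    append_rows(sheet_id, rows)
    # Only a short summary is returned to the model.
    return f"Synced {len(rows)} contacts."
```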

AI · Dec 29

Speech-to-Text Benchmark: Deepgram vs. Whisper in 2026

We benchmarked the leading speech-to-text (STT) providers, focusing specifically on healthcare applications. Our benchmark used real-world examples to assess transcription accuracy in medical contexts, where precision is crucial. Based on both word error rate (WER) and character error rate (CER), GPT-4o-transcribe demonstrated the highest transcription accuracy among all evaluated speech-to-text systems.
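
For context, WER is the word-level edit distance between a transcript and its reference divided by the number of reference words, and CER is the same quantity computed over characters. A minimal sketch of how such scores can be computed:

```python
def edit_distance(ref: list, hyp: list) -> int:
    # Standard Levenshtein distance via dynamic programming.
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution
    return d[-1][-1]

def wer(reference: str, hypothesis: str) -> float:
    # Word error rate: word-level edits / reference word count.
    ref, hyp = reference.split(), hypothesis.split()
    return edit_distance(ref, hyp) / max(len(ref), 1)

def cer(reference: str, hypothesis: str) -> float:
    # Character error rate: character-level edits / reference character count.
    ref, hyp = list(reference), list(hypothesis)
    return edit_distance(ref, hyp) / max(len(ref), 1)
```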

AI · Dec 26

LLM Parameters: GPT-5 High, Medium, Low and Minimal

New LLMs, such as OpenAI’s GPT-5 family, come in different versions (e.g., GPT-5, GPT-5-mini, and GPT-5-nano) and with different parameter settings: high, medium, low, and minimal. Below, we explore the differences between these versions by gathering their benchmark performance and the cost of running the benchmarks.
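
For illustration, a request with a reduced reasoning setting might look like the sketch below; the exact parameter name, accepted values, and model identifiers are assumptions based on the OpenAI Python SDK and may differ from the provider's current API.

```python
# Illustrative only: "reasoning_effort" values and the model name are
# assumptions and should be checked against the current API reference.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-5-mini",                 # assumed model identifier
    reasoning_effort="minimal",         # assumed values: minimal, low, medium, high
    messages=[{"role": "user", "content": "Summarize WER in one sentence."}],
)
print(response.choices[0].message.content)
```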

AI · Dec 25

Screenshot to Code: Lovable vs v0 vs Bolt in 2026

During my 20 years as a software developer, I led many front-end teams in building pages from designs inspired by screenshots. Such designs can now be transferred to code using AI tools.

AI · Dec 23

E-Commerce AI Video Maker Benchmark: Veo 3 vs Sora 2

Product visualization plays a crucial role in e-commerce success, yet creating high-quality product videos remains a significant challenge. Recent advancements in AI video generation technology offer promising solutions.

AI · Dec 19

OCR Benchmark: Text Extraction / Capture Accuracy [2026]

OCR accuracy is critical for many document processing tasks, and state-of-the-art multimodal LLMs now offer an alternative to traditional OCR.

AI · Dec 19

eCommerce AI Image Editing: GPT Images & Nano Banana

AI image editing tools analyze and automatically adjust product photos, allowing eCommerce businesses to enhance quality, remove backgrounds, or modify details with minimal effort. We tested the top 7 AI image editing tools on 20 images and 20 prompts across five dimensions: prompt adaptability, realism, shadows, color rendering, and image quality.

Agentic AI · Dec 9

AI Agents: Operator vs Browser Use vs Project Mariner ['26]

AI agents are increasingly marketed as end-to-end digital workers, but real-world performance can vary widely depending on the task, tools, and execution environment. To understand what these systems can genuinely deliver today, we conducted hands-on benchmarking across practical business scenarios.

AI · Dec 3

AI Reasoning Benchmark: MathR-Eval in 2026

We evaluated eight leading LLMs using a 100-question mathematical reasoning dataset, MathR-Eval, to measure how well each model solves structured, logic-based math problems. All models were tested zero-shot, with identical prompts and standardized answer checking. This enabled us to measure pure reasoning accuracy and compare both reasoning and non-reasoning models under the same conditions.
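
A simplified sketch of what standardized answer checking can look like; the normalization rules below are illustrative assumptions, not the actual MathR-Eval harness.

```python
import re

def normalize(answer: str) -> str:
    # Illustrative normalization: strip whitespace, lowercase, drop commas
    # in numbers, and remove a leading "answer:" prefix if present.
    a = answer.strip().lower()
    a = re.sub(r"^answer:\s*", "", a)
    return a.replace(",", "")

def is_correct(model_output: str, gold: str) -> bool:
    # Exact match after normalization, applied identically to every model.
    return normalize(model_output) == normalize(gold)

def accuracy(outputs: list, golds: list) -> float:
    # Zero-shot accuracy: fraction of questions whose normalized answer
    # matches the gold answer.
    correct = sum(is_correct(o, g) for o, g in zip(outputs, golds))
    return correct / len(golds)
```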