Performance Evaluation of LLM Based Chatbots With E2E Method: Llama-8b, Llama-7b, Gemma-7b and Mistral-7b

Karahoca, Adem

dc.contributor.advisor	Karahoca, Adem
dc.contributor.author	Karahoca, Adem
dc.contributor.other	02.02. Department of Computer Engineering
dc.contributor.other	02. Faculty of Engineering
dc.contributor.other	01. MEF University
dc.date.accessioned	2026-02-05T20:04:26Z
dc.date.available	2026-02-05T20:04:26Z
dc.date.issued	2025
dc.description.abstract	This study examines the performance of large language models (LLMs) in the context of customer support chatbots using an end-to-end (E2E) evaluation framework. Specifically, four leading open-source models (Gemma-7B, Mistral-7B, Llama-8B and Llama-7B) are compared on how well they understand user queries and produce meaningful, accurate responses. The chatbot application examined was designed to provide advisory services on a digital platform offering educational content, and it was tested with more than 3000 carefully prepared question-answer pairs. The evaluation combines both semantic and lexical criteria. Cosine similarity was used to determine how closely model responses match expert-written answers, and ROUGE metrics were used to measure word-level accuracy. The findings show that Gemma-7B and Llama-8B performed most consistently across all metrics, that Mistral-7B produced balanced but occasionally variable outputs, and that Llama-7B, although structurally strong, struggled to produce meaningful and contextually appropriate responses. The results set out the practical implications of model selection for real-world chatbot applications and underline the importance of multi-dimensional analysis methods when evaluating LLM performance in customer interaction settings.
dc.description.abstract	This study investigates the performance of large language models (LLMs) within the context of customer support chatbots by employing an end-to-end (E2E) evaluation framework. Specifically, it compares four prominent open-source models (Gemma-7B, Mistral-7B, Llama-7B and Llama-8B) based on their ability to comprehend and respond to user queries in a meaningful and accurate manner. The chatbot application under review was designed to provide assistance on an educational content platform and was tested using over 3000 curated question-answer pairs. The evaluation combines both semantic and lexical metrics, using cosine similarity to measure the alignment of model responses with expert-written answers, and ROUGE metrics to assess word-level accuracy. Additionally, the study incorporates prompt engineering techniques and analyses how models handle random or off-topic inputs, providing a comprehensive view of their reliability and contextual sensitivity. Results indicate that Gemma-7B and Llama-8B perform most consistently across all metrics, while Mistral-7B offers balanced outputs with occasional variance. Llama-7B, although structurally robust, struggled to deliver semantically aligned and contextually appropriate responses. Overall, the findings highlight the practical implications of model selection for real-world chatbot deployments and demonstrate the importance of multi-dimensional evaluation methods when assessing LLM performance in customer interaction settings.	en_US
dc.identifier.uri	https://tez.yok.gov.tr/UlusalTezMerkezi/TezGoster?key=CtwiQkYvArAb95Ufpfs_vlrnXwAkJSbquBoosCbqiWErU_NynrpJjvK99SmMlrjq
dc.identifier.uri	https://hdl.handle.net/20.500.11779/3193
dc.language.iso	en
dc.subject	Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol
dc.subject	Computer Engineering and Computer Science and Control	en_US
dc.title	Uçtan Uca Yöntemi İle Büyük Dil Modeli Tabanlı Sohbet Botlarının Performans Değerlendirmesi:Llama-8b,Llama-7b,Gemma-7b ve Mistral-7b
dc.title	Performance Evaluation of LLM Based Chatbots With E2E Method: Llama-8b, Llama-7b, Gemma-7b and Mistral-7b	en_US
dc.type	Master Thesis	en_US
dspace.entity.type	Publication
gdc.author.institutional	Karahoca, Adem
gdc.coar.access	metadata only access
gdc.coar.type	text::thesis::master thesis
gdc.description.department	Graduate School of Natural and Applied Sciences / Department of Information Technologies Engineering
gdc.description.endpage	64
gdc.identifier.yoktezid	986645
gdc.publishedmonth	September
gdc.yokperiod	YÖK - 2025-26
relation.isAuthorOfPublication	f5501168-1d99-490d-878e-dc43f3741ab7
relation.isAuthorOfPublication.latestForDiscovery	f5501168-1d99-490d-878e-dc43f3741ab7
relation.isOrgUnitOfPublication	05ffa8cd-2a88-4676-8d3b-fc30eba0b7f3
relation.isOrgUnitOfPublication	0d54cd31-4133-46d5-b5cc-280b2c077ac3
relation.isOrgUnitOfPublication	a6e60d5c-b0c7-474a-b49b-284dc710c078
relation.isOrgUnitOfPublication.latestForDiscovery	05ffa8cd-2a88-4676-8d3b-fc30eba0b7f3
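
The abstract describes scoring each model response against an expert-written answer with cosine similarity (semantic alignment) and ROUGE (word-level overlap). The record does not say how the thesis computed either metric, so the following is only a minimal sketch under stated assumptions: cosine similarity is taken over simple word-count vectors rather than any embedding model, ROUGE-1 and ROUGE-L are computed as F1 scores on whitespace tokens, and the function names and example sentences are hypothetical.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (an assumed, simplistic tokenizer)."""
    return text.lower().split()


def cosine_similarity(reference, candidate):
    """Cosine similarity between word-count vectors of the two texts."""
    ref, cand = Counter(tokenize(reference)), Counter(tokenize(candidate))
    dot = sum(ref[tok] * cand[tok] for tok in ref)
    norm = math.sqrt(sum(c * c for c in ref.values())) * math.sqrt(
        sum(c * c for c in cand.values())
    )
    return dot / norm if norm else 0.0


def f1(matches, ref_len, cand_len):
    """Harmonic mean of precision (matches/cand_len) and recall (matches/ref_len)."""
    if matches == 0:
        return 0.0
    precision, recall = matches / cand_len, matches / ref_len
    return 2 * precision * recall / (precision + recall)


def rouge_1(reference, candidate):
    """ROUGE-1 F1: clipped unigram overlap between reference and candidate."""
    ref, cand = Counter(tokenize(reference)), Counter(tokenize(candidate))
    overlap = sum((ref & cand).values())
    return f1(overlap, sum(ref.values()), sum(cand.values()))


def rouge_l(reference, candidate):
    """ROUGE-L F1: based on the longest common subsequence of tokens."""
    ref, cand = tokenize(reference), tokenize(candidate)
    prev = [0] * (len(cand) + 1)  # LCS lengths for the previous reference token
    for r in ref:
        cur = [0]
        for j, c in enumerate(cand, 1):
            cur.append(prev[j - 1] + 1 if r == c else max(prev[j], cur[j - 1]))
        prev = cur
    return f1(prev[-1], len(ref), len(cand))


if __name__ == "__main__":
    # Hypothetical question-answer pair, not taken from the thesis data set.
    expert = "You can reset your password from the account settings page"
    model = "Reset your password on the account settings page"
    print("cosine :", round(cosine_similarity(expert, model), 3))
    print("ROUGE-1:", round(rouge_1(expert, model), 3))
    print("ROUGE-L:", round(rouge_l(expert, model), 3))
```

Averaging such scores over all question-answer pairs for each model would give one comparable figure per metric. A word-count cosine only rewards shared vocabulary, so an embedding-based similarity would be needed to capture paraphrases.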

Collections

Master's Theses
