Questions tagged [language-model]
Language models are probability distributions over sequences of words or terms and are used extensively in Natural Language Processing (NLP).
149 questions
2 votes
0 answers
34 views
Evaluation of token importance attribution based on human rationales
I am working on evaluating an explainability method for a text classification model that predicts whether a given text sequence contains hate speech or not. The method outputs token-level importance ...
0 votes
0 answers
69 views
How much improvement does OpenAI o1 achieve from the chain of thought?
https://openai.com/index/learning-to-reason-with-llms/ OpenAI o1 also adds more data compared with the previous version of the LLM.
1 vote
1 answer
99 views
Callback handlers in Langchain
This might be an odd question, but why are there two versions of the code for the class BaseCallbackHandler? https://api.python.langchain.com/en/latest/_modules/langchain_core/callbacks/base.html#BaseCallbackHandler ...
0 votes
1 answer
73 views
Which languages does llama2 support?
Which languages does llama2 support? I looked at the docs and huggingface but I couldn't find a list. They only say that use in languages other than English is out-of-scope.
0 votes
1 answer
71 views
How can I get the list of pretrained large language models?
Is there any place where I can get a neat list of pre-trained large language models? Besides the most common ones like GPT, BARD, and llama2, which LLM do you suggest that can be used for RAG and ...
0 votes
1 answer
99 views
How to check the license of an LLM for specific use?
How can I check whether a large language model has a license that allows fine-tuning the model and then publishing it publicly? How can I be sure that I can use and fine-tune a large language model without ...
1 vote
2 answers
117 views
How to choose the ideal pretrained model for fine-tuning?
I recently started working with LLMs and want to know how people choose pre-trained models for their fine-tuning tasks. What are the criteria for choosing the base model, and which factors affect the choice?
0 votes
1 answer
58 views
Is Machine Reading Comprehension (MRC) outdated?
I recently went through some literature about knowledge-enhanced language models and found connections with the Machine Reading Comprehension (MRC) task. However, I couldn't find papers more recent ...
1 vote
1 answer
689 views
How can I leverage machine learning for log analysis?
I am new to data science and am looking for ways to apply it to my tasks. I have a set of logs which I want to convert to JSON. The logs are more or less in the same format and I can write ...
0 votes
1 answer
235 views
Purely extractive Language Model
Given an email thread, I am trying to extract the body of the most recent email. I used to do that with rules. Now I am testing Large Language Models (LLMs) to see if they provide a less ad hoc ...
0 votes
1 answer
327 views
Open-Source Large Language Models (LLM): Your experience and recommendation
I’m looking for an open-source LLM for a new project. I want to use it for instructions and to fine-tune the model to a specific domain like legal and rights. Some LLMs are open-source, but they didn’...
0 votes
1 answer
381 views
What is the input to an encoder-decoder transformer in a next-word prediction task?
I'm trying to understand how encoder-decoder architectures are used, or if they are used at all, for generative tasks that do not require an explicit prompt (i.e. machine translation, summarization, ...
1 vote
1 answer
4k views
Why is 0.7, in general, the default value of temperature for LLMs?
I have recently read through a lot of documentation and articles about Large Language Models (LLMs), and I have come to the conclusion that 0.7 is, most of the time, the default value for the ...
0 votes
0 answers
40 views
TFRobertaSequenceClassification for Address Normalization task
I have a dataset with two columns: one with faulty addresses and the other with correct addresses. I want to train a model that I can use later to correct all incoming faulty addresses. I ...
0 votes
0 answers
1k views
How to read CSV File into Vector Store
I have a CSV file, and I am using langchain to read it into the vector store FAISS. My question is, since I have a CSV file, is RecursiveTextSplitter required? Put differently, consider the following ...