
Questions tagged [language-model]

Language models are probability distributions over sequences of words or terms and are used extensively in Natural Language Processing (NLP).

2 votes
0 answers
34 views

I am working on evaluating an explainability method for a text classification model that predicts whether a given text sequence contains hate speech or not. The method outputs token-level importance ...
Marc's user avatar
  • 21
0 votes
0 answers
69 views

https://openai.com/index/learning-to-reason-with-llms/ OpenAI o1 also adds more data than the previous version of the LLM.
CoderOnly's user avatar
  • 721
1 vote
1 answer
99 views

This might be an odd question, but why are there two code definitions for the class BaseCallbackHandler? https://api.python.langchain.com/en/latest/_modules/langchain_core/callbacks/base.html#BaseCallbackHandler ...
Justin Jonany's user avatar
0 votes
1 answer
73 views

Which languages does Llama 2 support? I looked at the docs and Hugging Face but couldn't find a list. They only say that use in languages other than English is out of scope.
heyula's user avatar
  • 47
0 votes
1 answer
71 views

Is there a place where I can get a neatly organized list of pre-trained large language models? Besides the most common ones like GPT, Bard, and Llama 2, which LLM do you suggest that can be used for RAG and ...
heyula's user avatar
  • 47
0 votes
1 answer
99 views

How can I check whether a large language model's license allows fine-tuning the model and then publishing it publicly? How can I be sure that I can use and fine-tune a large language model without ...
heyula's user avatar
  • 47
1 vote
2 answers
117 views

I recently started working with LLMs and want to know how people choose pre-trained models for their fine-tuning tasks. What are the criteria for choosing the base model, and which factors affect the choice?
heyula's user avatar
  • 47
0 votes
1 answer
58 views

I recently went through some literature about knowledge-enhanced language models and found connections with the Machine Reading Comprehension (MRC) task. However, I couldn't find papers more recent ...
Barbara Gendron's user avatar
1 vote
1 answer
689 views

I am new to data science and am trying to explore the possibilities of using data science in tasks. I have a set of logs that I want to convert to JSON. The logs are more or less of the same format and I can write ...
SUNITA GUPTA's user avatar
0 votes
1 answer
235 views

Given an email thread, I am trying to extract the body of the most recent email. I used to do that with rules. Now I am testing Large Language Models (LLMs) to see if they provide a less ad hoc ...
mirix's user avatar
  • 103
0 votes
1 answer
327 views

I’m looking for an open-source LLM for a new project. I want to use it for instructions and to fine-tune the model to a specific domain like legal and rights. Some LLMs are open-source, but they didn’...
Christian01's user avatar
0 votes
1 answer
381 views

I'm trying to understand how encoder-decoder architectures are used, or if they are used at all, for generative tasks that do not require an explicit prompt (i.e., machine translation, summarization, ...
monopoly's user avatar
  • 103
1 vote
1 answer
4k views

I have recently read through a lot of documentation and articles about Large Language Models (LLMs), and I have come to the conclusion that 0.7 is, most of the time, the default value for the ...
jmpion's user avatar
  • 11
0 votes
0 answers
40 views

I have a dataset with two columns: one with faulty addresses and the other with correct addresses. I want to train a model that I can later use to correct all incoming faulty addresses. I ...
learner_account's user avatar
0 votes
0 answers
1k views

I have a CSV file, and I am using langchain to read it into the vector store FAISS. My question is, since I have a CSV file, is RecursiveTextSplitter required? Put differently, consider the following ...
Karl 17302's user avatar
