Using LLMs for Downstream Classification: Prompt, Verbalize, Train

11 Jun 2025

Authors:

(1) Goran Muric, InferLink Corporation, Los Angeles, California ([email protected]);

(2) Ben Delay, InferLink Corporation, Los Angeles, California ([email protected]);

(3) Steven Minton, InferLink Corporation, Los Angeles, California ([email protected]).

Abstract and 1 Introduction

1.1 Motivation

2 Related Work and 2.1 Prompting techniques

2.2 In-context learning

2.3 Model interpretability

3 Method

3.1 Generating questions

3.2 Prompting LLM

3.3 Verbalizing the answers and 3.4 Training a classifier

4 Data and 4.1 Clinical trials

4.2 Catalonia Independence Corpus and 4.3 Climate Detection Corpus

4.4 Medical health advice data and 4.5 The European Court of Human Rights (ECtHR) Data

4.6 UNFAIR-ToS Dataset

5 Experiments

6 Results

7 Discussion

7.1 Implications for Model Interpretability

7.2 Limitations and Future Work

Reproducibility

Acknowledgment and References

A Questions used in ICE-T method

3.2 Prompting LLM

The LLMs are prompted on two occasions. First, they are prompted to obtain the set of secondary questions Q, as described in Section 3.1. Second, for each document, we prompt the LLM with the document and the corresponding secondary questions.

Then, for each question qi, the output ai of the LLM is collected, creating a set of outputs for each document. The textual outputs are then assigned numerical values and transformed into a feature vector vi through the verbalization process explained in Section 3.3.
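The collection step can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the prompt wording, the one-prompt-per-question layout, and the `ask_llm` callable (a stand-in for whatever LLM client is used) are not specified in this section.

```python
from typing import Callable, List

# Hypothetical prompt template; the exact wording used by the authors
# is not given in this section.
PROMPT_TEMPLATE = (
    "Document:\n{document}\n\n"
    "Question: {question}\n"
    "Answer with exactly one of: Yes, No, Unknown."
)


def collect_answers(
    documents: List[str],
    questions: List[str],
    ask_llm: Callable[[str], str],
) -> List[List[str]]:
    """Return, for each document, the raw LLM answer a_i to every question q_i.

    `ask_llm` is a hypothetical callable that takes a prompt string and
    returns the model's text reply.
    """
    all_answers = []
    for document in documents:
        answers = []
        for question in questions:
            # Assumption: one prompt per (document, question) pair.
            prompt = PROMPT_TEMPLATE.format(document=document, question=question)
            answers.append(ask_llm(prompt).strip())
        all_answers.append(answers)
    return all_answers
```

Passing the LLM client in as a callable keeps the sketch independent of any particular provider and makes it easy to substitute a stub when testing.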

Figure 1: Illustration of training and inference process in ICE-T. In the training phase, the process begins by generating questions to prompt an LLM, which then provides yes/no answers. These answers are verbalized and converted into numerical feature vectors. A classifier is trained using these vectors along with their respective labels. During inference, the LLM is prompted with the same questions, and the answers are similarly processed to predict outcomes using the trained classifier.

3.3 Verbalizing the answers

The output of the LLM in response to each prompt is limited to one of three possible values: Yes, No, or Unknown, depending on the answer to the question posed in the prompt. These responses are subsequently assigned numerical values for analysis, with “Yes” translating to 1, “No” to 0, and “Unknown” to 0.5.
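As a minimal sketch, the mapping above can be written as a small lookup in Python. The function names are hypothetical, and treating any off-format reply as "Unknown" is an assumption of this sketch, not something stated in the text.

```python
from typing import List

# Mapping stated in the text: Yes -> 1, No -> 0, Unknown -> 0.5.
ANSWER_VALUES = {"yes": 1.0, "no": 0.0, "unknown": 0.5}


def verbalize(answer: str) -> float:
    """Map one textual LLM answer to its numerical value."""
    key = answer.strip().rstrip(".").lower()
    # Assumption: a reply outside the three allowed values counts as Unknown.
    return ANSWER_VALUES.get(key, 0.5)


def to_feature_vector(answers: List[str]) -> List[float]:
    """Turn the answers for one document into a numerical feature vector."""
    return [verbalize(a) for a in answers]
```

For example, `to_feature_vector(["Yes", "No", "Unknown"])` yields `[1.0, 0.0, 0.5]`.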

3.4 Training a classifier

To train a classifier, we use a set V of low-dimensional numerical vectors, where each vector vi has n + 1 components (|vi| = n + 1), and a corresponding set of labels X, where each vector vi has a binary label xi. The vectors in V are obtained from the training text by prompting the LLM to generate n + 1 outputs per document, each of which is then assigned a numerical value. A classifier is then trained using 5-fold cross-validation and a grid search for the best parameters. The choice of a specific classification algorithm depends on the size of the training data, the distribution of values, and the desired performance on a specific classification metric.
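The training step might look roughly like the following scikit-learn sketch. The choice of logistic regression, the parameter grid, the scoring metric, and the synthetic toy data are all assumptions made for illustration; the text leaves the algorithm open and does not list the grids that were searched.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV


def train_classifier(V, X):
    """Fit a classifier on feature vectors V and binary labels X.

    Uses 5-fold cross-validation with a grid search, as described in the
    text. The estimator and grid below are illustrative assumptions.
    """
    search = GridSearchCV(
        estimator=LogisticRegression(max_iter=1000),
        param_grid={"C": [0.01, 0.1, 1.0, 10.0]},  # hypothetical grid
        cv=5,                # 5-fold cross-validation
        scoring="f1_micro",  # assumption: swap in the metric that matters
    )
    search.fit(np.asarray(V, dtype=float), np.asarray(X, dtype=int))
    return search.best_estimator_, search.best_params_


# Synthetic toy data, for illustration only: 60 documents, 5 answers each,
# with values drawn from the verbalized set {0, 0.5, 1}.
rng = np.random.default_rng(0)
V = rng.choice([0.0, 0.5, 1.0], size=(60, 5))
X = (V[:, 0] > 0.5).astype(int)  # toy label driven by the first answer

model, params = train_classifier(V, X)
```

At inference time, a new document's feature vector would be passed to `model.predict`. Any other scikit-learn estimator with its own parameter grid could be substituted for `LogisticRegression` in the same way.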

This paper is available on arXiv under a CC BY 4.0 Deed (Attribution 4.0 International) license.