INDEX
Explanations
numerical values in different contexts, potentially related to measurements or metrics
expressions of high emotional states or intensity
Negative Logits
heit          -0.73
rejo          -0.66
pil           -0.66
bol           -0.62
pursued       -0.61
affili        -0.61
porch         -0.61
citizenship   -0.61
Hispan        -0.61
revis         -0.60
Positive Logits
Therefore     1.01
However       0.87
Especially    0.84
Its           0.81
BUT           0.81
Furthermore   0.78
Moreover      0.77
Until         0.76
Imagine       0.75
Specifically  0.74
Activations Density: 0.494%