Explanations
sentences or phrases indicating certainty or emphasis, often with the phrase "in fact"
statements asserting the truth of claims or facts
Negative Logits
Flavoring   -0.79
bye         -0.69
Crown       -0.64
Greenwood   -0.63
Citiz       -0.62
eyebrows    -0.61
hairs       -0.60
banana      -0.60
lungs       -0.59
South       -0.58
Positive Logits
ional       1.12
netflix     0.82
REP         0.79
çī          0.74
olkien      0.73
uality      0.71
opus        0.70
Obj         0.69
akes        0.69
abetic      0.68
Activations Density 0.017%