Explanations
phrases that indicate a specific action or sentiment, often related to personal experiences or opinions
Negative Logits
Glou        -0.68
anwhile     -0.68
Ambro       -0.67
Skydragon   -0.65
Thomson     -0.65
Sleeping    -0.64
Bened       -0.63
Ket         -0.62
incent      -0.62
Simpl       -0.61
Positive Logits
¬           1.29
Ĵ           1.15
ħ           1.11
ĸ           1.10
ı           1.10
į           1.08
Ļ           1.06
Ķ           1.04
¯           1.03
ĩ           1.02
Activations Density: 0.223%