INDEX
Explanations
affirmative responses
affirmations or confirmations regarding various topics
Negative Logits
bage     -0.87
recated  -0.70
Offline  -0.67
ILCS     -0.67
leted    -0.66
RAW      -0.65
rarily   -0.65
ufact    -0.61
killer   -0.61
cheon    -0.60
Positive Logits
terday   1.72
YES      0.78
sir      0.76
yne      0.73
yes      0.72
sis      0.69
matter   0.69
eed      0.68
ñ        0.67
Deal     0.66
Activations Density: 0.020%