INDEX
Explanations
phrases expressing beliefs, thoughts, and desires
expressions of belief or perception regarding knowledge and expectations
Negative Logits
Token      Logit
Paige      -0.72
Noir       -0.70
Mir        -0.70
Miranda    -0.70
pa         -0.67
235        -0.67
(blank)    -0.67
æľ         -0.67
Pryor      -0.66
ranch      -0.65
Positive Logits
Token      Logit
we         0.92
adi        0.81
WE         0.80
Twe        0.80
Twe        0.79
^          0.79
df         0.76
Ulster     0.75
Weld       0.75
Webb       0.74
Activations Density: 0.223%