INDEX
Explanations
names of specific individuals
occurrences of the word "al" and its variations, indicating a focus on certain patterns or suffixes
Negative Logits
*/(               -0.89
soDeliveryDate    -0.84
ecause            -0.83
Committees        -0.74
artments          -0.69
enegger           -0.69
UID               -0.68
akeru             -0.67
ascript           -0.64
selves            -0.64
Positive Logits
°                 0.73
Ŀ                 0.70
Ĭ                 0.66
Ö¼                0.64
prom              0.63
ê                 0.63
ili               0.63
®                 0.62
iscons            0.61
edom              0.60
Activations Density: 0.257%