INDEX
Explanations
phrases related to self-identity and introspection
instances of a specific character or symbol
Negative Logits
ammy          -0.70
Gutenberg     -0.67
fortun        -0.66
fragmentation -0.63
mathemat      -0.63
pressures     -0.61
disadvant     -0.59
Palestin      -0.58
fertil        -0.57
intercepted   -0.54
Positive Logits
ï¸ı   0.88
ï¸    0.77
sure  0.71
İ     0.70
Balt  0.70
hood  0.67
----------------------------------------------------------------  0.66
hall  0.65
âģ    0.64
£     0.64
Activations Density: 0.283%