INDEX
Explanations
mentions of specific amounts or values
instances of the character 'ľ'
Negative Logits
manif         -0.84
incorpor      -0.83
tyres         -0.72
cones         -0.71
neighb        -0.71
slic          -0.69
levers        -0.69
stunts        -0.69
electrodes    -0.68
psychedel     -0.68
Positive Logits
ï¸ı                         1.28
âĶĢâĶĢâĶĢâĶĢâĶĢâĶĢâĶĢâĶĢ    1.27
âĶĢâĶĢ                      1.26
âĶĢâĶĢâĶĢâĶĢ                1.26
âĿ                          1.01
âĸ¬âĸ¬                      0.91
âľ                          0.91
âĹ                          0.91
STER                        0.83
----------------------------------------------------------------  0.83
Activations Density 0.108%
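
Several of the positive-logit tokens above (for example `âĶĢ` and `âĸ¬`) look like raw byte-level BPE vocabulary entries rather than the characters they stand for. The page does not say which tokenizer produced them, so the following is only a minimal sketch that assumes a GPT-2 style byte-to-character table. The helper names `bytes_to_unicode` and `decode_token` are illustrative and not part of the dashboard.

```python
def bytes_to_unicode():
    """Build the byte -> printable-character table used by GPT-2 style
    byte-level BPE (assumed here; the source does not name the tokenizer)."""
    printable = (
        list(range(0x21, 0x7F))     # '!' .. '~'
        + list(range(0xA1, 0xAD))   # U+00A1 .. U+00AC
        + list(range(0xAE, 0x100))  # U+00AE .. U+00FF
    )
    printable_set = set(printable)
    table = {b: chr(b) for b in printable}
    shift = 0
    for b in range(256):
        if b not in printable_set:
            # Remaining bytes are mapped to code points from U+0100 upward.
            table[b] = chr(256 + shift)
            shift += 1
    return table


# Inverse table: displayed character -> original byte value.
CHAR_TO_BYTE = {ch: b for b, ch in bytes_to_unicode().items()}


def decode_token(token: str) -> str:
    """Map a displayed vocabulary string back to bytes and decode as UTF-8.
    Incomplete byte sequences are shown as U+FFFD instead of raising."""
    raw = bytes(CHAR_TO_BYTE[ch] for ch in token)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    for tok in ["âĶĢâĶĢ", "âĸ¬âĸ¬", "ï¸ı", "âľ", "STER"]:
        print(repr(tok), "->", repr(decode_token(tok)))
```

Under that assumption, `âĶĢ` corresponds to the bytes E2 94 80, the box-drawing character "─", and `âĸ¬` to "▬", which would make most of the top positive logits horizontal-rule or divider characters. Entries such as `âĿ`, `âľ`, and `âĹ` map to only two bytes, so they would be partial UTF-8 sequences that do not decode to a character on their own.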