INDEX
Explanations
phrases indicating conditional situations or hypothetical scenarios
Negative Logits
AccessorTable   -0.66
Contours        -0.54
enderror        -0.49
الدراسه         -0.48
justicia        -0.45
Зноскі          -0.45
يميديا          -0.44
rungsseite      -0.44
AMPTON          -0.42
fjor            -0.42
Positive Logits
myſelf          0.77
ſch             0.75
Diſ             0.75
Perſ            0.73
Monfieur        0.71
ISNI            0.70
ſtre            0.69
Majefty         0.69
Inſ             0.68
Theſe           0.68
Activations Density: 0.036%