INDEX
Explanations
instances of the verb "make" and its variations related to assertions or claims
Negative Logits
imetype    -0.15
οκ         -0.15
rud        -0.15
iks        -0.15
ucher      -0.15
太阳城     -0.15
ず         -0.14
agi        -0.14
ije        -0.14
omo        -0.14
Positive Logits
fun        0.30
clear      0.30
reference  0.27
fun        0.24
Fun        0.23
clear      0.21
abund      0.21
Fun        0.21
light      0.21
reference  0.21
Activations Density 0.060%