INDEX
Explanations
instances of the word "take" and its variations in different contexts
Negative Logits
udas     -0.18
imi      -0.17
мот      -0.15
伝       -0.14
etur     -0.14
irut     -0.14
ignet    -0.14
iman     -0.14
owe      -0.14
eza      -0.14
Positive Logits
exception     0.36
offense       0.30
um            0.30
notice        0.28
offence       0.27
issue         0.26
pleasure      0.26
pains         0.25
comfort       0.24
exceptions    0.23
Activations Density 0.059%