INDEX
Explanations
references to opposing sides or parties in a discussion or argument
Negative Logits
ooth            -0.16
ilig            -0.15
ErrorException  -0.15
582             -0.15
ucch            -0.15
çŃĭ             -0.14
traction        -0.14
ัà¸Ķส           -0.14
unas            -0.14
indr            -0.14

Positive Logits
coin            0.42
aisle           0.37
coin            0.34
equation        0.31
Coin            0.31
fence           0.29
coins           0.28
divide          0.28
Coin            0.28
ledger          0.25
Activations Density 0.028%