INDEX
Explanations
terms related to scoring and evaluations
NEGATIVE LOGITS
ekil        -0.17
igm         -0.17
ector       -0.17
rud         -0.16
-webpack    -0.15
oral        -0.15
orp         -0.15
shade       -0.15
dehyde      -0.15
rem         -0.15
POSITIVE LOGITS
card        0.35
cards       0.31
boards      0.24
keeper      0.20
keeping     0.20
occo        0.20
CARD        0.19
board       0.19
agli        0.18
keepers     0.18
Activations Density 0.015%