INDEX
Explanations
references to visual or descriptive features related to shape or outline
Negative Logits
ähr       -0.18
ebi       -0.18
odable    -0.16
eyse      -0.15
sport     -0.15
meteor    -0.15
riott     -0.14
squat     -0.14
infeld    -0.13
erap      -0.13
Positive Logits
659           0.15
istrovství    0.14
ingham        0.14
ettle         0.14
986           0.14
-assets       0.13
-Ta           0.13
ously         0.13
крет          0.13
fig           0.13
Activation Density: 0.008%