INDEX
Explanations
references to states and their attributes or conditions
Negative Logits
Keeper      -0.74
holder      -0.72
keeper      -0.70
owners      -0.66
owner       -0.65
owner       -0.65
keeper      -0.64
umat        -0.64
keepers     -0.64
perature    -0.64
Positive Logits
ć           0.68
ドラ        0.65
カ          0.63
belts       0.62
cius        0.62
igible      0.61
veiled      0.61
measles     0.60
399         0.60
324         0.59
Activations Density 0.099%