INDEX
Explanations
gold health contact understanding resistance
Negative Logits
Token       Value
tend        0.47
Tend        0.44
tend        0.43
Tend        0.42
wiring      0.42
transport   0.39
transport   0.38
뷸          0.36
Transport   0.36
ይመ          0.36
Positive Logits
Token       Value
Gar         0.50
gar         0.49
Gar         0.46
Garo        0.42
Garcia      0.41
tràn        0.39
Garcia      0.38
Ashley      0.38
GAR         0.38
Robb        0.38
Activations Density: 0.000%