Explanations
terms related to the importance or significance of specific factors, especially in the context of medical, scientific, and societal discussions
Negative Logits
Token       Logit
renheit     -0.83
uthor       -0.79
©¶æ         -0.77
Fever       -0.76
Carbuncle   -0.75
ibal        -0.74
fred        -0.73
çİĭ         -0.73
bows        -0.71
ilk         -0.70
Positive Logits
Token         Logit
ingredient    1.00
importance    0.98
stone         0.90
distinction   0.85
ingred        0.85
component     0.82
components    0.81
hinge         0.79
aspect        0.78
aspects       0.78
Activations Density: 1.571%