ML with LightGBM: Using LightGBM and the SHAP algorithm on the Titanic dataset for feature interpretability (quantifying each feature's contribution score to the model)
Contents

Using LightGBM and SHAP on the Titanic dataset for feature interpretability (quantifying feature contribution scores)
Design approach
Output
Core code
Using LightGBM and SHAP on the Titanic dataset for feature interpretability (quantifying feature contribution scores)
Design approach

To be updated……
Output
Core code
The code below is excerpted from the shap library source (version 0.37.0): first the package initializer `shap/__init__.py`, then the `summary_legacy` beeswarm-plot function from `shap/plots/_beeswarm.py` (where `labels` and `colors` are module-level imports within the library).

```python
# flake8: noqa

# --- shap/__init__.py ---
import warnings
import sys

__version__ = '0.37.0'

# check python version
if (sys.version_info < (3, 0)):
    warnings.warn("As of version 0.29.0 shap only supports Python 3 (not 2)!")

from ._explanation import Explanation, Cohorts

# explainers
from .explainers._explainer import Explainer
from .explainers._kernel import Kernel as KernelExplainer
from .explainers._sampling import Sampling as SamplingExplainer
from .explainers._tree import Tree as TreeExplainer
from .explainers._deep import Deep as DeepExplainer
from .explainers._gradient import Gradient as GradientExplainer
from .explainers._linear import Linear as LinearExplainer
from .explainers._partition import Partition as PartitionExplainer
from .explainers._permutation import Permutation as PermutationExplainer
from .explainers._additive import Additive as AdditiveExplainer
from .explainers import other

# plotting (only loaded if matplotlib is present)
def unsupported(*args, **kwargs):
    warnings.warn("matplotlib is not installed so plotting is not available! "
                  "Run `pip install matplotlib` to fix this.")

try:
    import matplotlib
    have_matplotlib = True
except ImportError:
    have_matplotlib = False

if have_matplotlib:
    from .plots._beeswarm import summary_legacy as summary_plot
    from .plots._decision import decision as decision_plot, multioutput_decision as multioutput_decision_plot
    from .plots._scatter import dependence_legacy as dependence_plot
    from .plots._force import force as force_plot, initjs, save_html, getjs
    from .plots._image import image as image_plot
    from .plots._monitoring import monitoring as monitoring_plot
    from .plots._embedding import embedding as embedding_plot
    from .plots._partial_dependence import partial_dependence as partial_dependence_plot
    from .plots._bar import bar_legacy as bar_plot
    from .plots._waterfall import waterfall as waterfall_plot
    from .plots._group_difference import group_difference as group_difference_plot
    from .plots._text import text as text_plot
else:
    summary_plot = unsupported
    decision_plot = unsupported
    multioutput_decision_plot = unsupported
    dependence_plot = unsupported
    force_plot = unsupported
    initjs = unsupported
    save_html = unsupported
    image_plot = unsupported
    monitoring_plot = unsupported
    embedding_plot = unsupported
    partial_dependence_plot = unsupported
    bar_plot = unsupported
    waterfall_plot = unsupported
    text_plot = unsupported

# other stuff :)
from . import datasets
from . import utils
from . import links

#from . import benchmark

from .utils._legacy import kmeans
from .utils import sample, approximate_interactions


# --- shap/plots/_beeswarm.py (excerpt) ---
# TODO: Add support for hclustering based explanations where we sort the leaf order
# by magnitude and then show the dendrogram to the left
def summary_legacy(shap_values, features=None, feature_names=None, max_display=None, plot_type=None,
                   color=None, axis_color="#333333", title=None, alpha=1, show=True, sort=True,
                   color_bar=True, plot_size="auto", layered_violin_max_num_bins=20, class_names=None,
                   class_inds=None,
                   color_bar_label=labels["FEATURE_VALUE"],
                   cmap=colors.red_blue,
                   # deprecated
                   auto_size_plot=None,
                   use_log_scale=False):
    """Create a SHAP beeswarm plot, colored by feature values when they are provided.

    Parameters
    ----------
    shap_values : numpy.array
        For single output explanations this is a matrix of SHAP values (# samples x # features).
        For multi-output explanations this is a list of such matrices of SHAP values.

    features : numpy.array or pandas.DataFrame or list
        Matrix of feature values (# samples x # features) or a feature_names list as shorthand

    feature_names : list
        Names of the features (length # features)

    max_display : int
        How many top features to include in the plot (default is 20, or 7 for interaction plots)

    plot_type : "dot" (default for single output), "bar" (default for multi-output), "violin",
        or "compact_dot".
        What type of summary plot to produce. Note that "compact_dot" is only used for
        SHAP interaction values.

    plot_size : "auto" (default), float, (float, float), or None
        What size to make the plot. By default the size is auto-scaled based on the number of
        features that are being displayed. Passing a single float will cause each row to be
        that many inches high. Passing a pair of floats will scale the plot by that
        number of inches. If None is passed then the size of the current figure will be left
        unchanged.
    """

    # support passing an explanation object
    if str(type(shap_values)).endswith("Explanation'>"):
        shap_exp = shap_values
        base_value = shap_exp.base_value
        shap_values = shap_exp.values
        if features is None:
            features = shap_exp.data
        if feature_names is None:
            feature_names = shap_exp.feature_names
        # if out_names is None: # TODO: waiting for slicer support of this
        #     out_names = shap_exp.output_names

    # deprecation warnings
    if auto_size_plot is not None:
        warnings.warn("auto_size_plot=False is deprecated and is now ignored! Use plot_size=None instead.")

    multi_class = False
    if isinstance(shap_values, list):
        multi_class = True
        if plot_type is None:
            plot_type = "bar"  # default for multi-output explanations
        assert plot_type == "bar", "Only plot_type = 'bar' is supported for multi-output explanations!"
    else:
        if plot_type is None:
            plot_type = "dot"  # default for single output explanations
        assert len(shap_values.shape) != 1, "Summary plots need a matrix of shap_values, not a vector."

    # default color:
    if color is None:
        if plot_type == 'layered_violin':
            color = "coolwarm"
        elif multi_class:
            color = lambda i: colors.red_blue_circle(i / len(shap_values))
        else:
            color = colors.blue_rgb
    # ... (remainder of the function omitted)
```
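One reusable pattern in the excerpt above is the optional-dependency guard: the plotting functions are only imported when matplotlib is present, and are otherwise replaced by a stub that warns when called instead of crashing at import time. A standalone sketch of that pattern (the names here are illustrative, not shap's own):

```python
import warnings

def _unsupported_stub(name):
    """Build a stand-in that warns when called, instead of failing at import time."""
    def unsupported(*args, **kwargs):
        warnings.warn(
            f"matplotlib is not installed so {name} is not available! "
            "Run `pip install matplotlib` to fix this.")
    return unsupported

# Probe for the optional dependency exactly once, at import time.
try:
    import matplotlib  # noqa: F401
    have_matplotlib = True
except ImportError:
    have_matplotlib = False

if have_matplotlib:
    def summary_plot(values):
        # A real implementation would draw the beeswarm plot here.
        return f"plotted {len(values)} rows"
else:
    summary_plot = _unsupported_stub("summary_plot")
```

Either way, callers can import `summary_plot` unconditionally; the cost of the missing dependency is only paid if the function is actually called.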