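Sample Query 6: Predicting Associated Items
This example uses the association model created in the Basic Data Mining Tutorial. It shows how to create a prediction query that tells you which products to recommend to a customer who has purchased a particular product. This type of query is called a singleton query: you supply the input values to the model directly in SELECT statements, combined with UNION when there is more than one input row. Because the predictable model column that corresponds to the new values is a nested table, you must use one SELECT clause to map the new value to the nested table column, [Model], and another SELECT clause to map the nested table column to the case-level column, [v Assoc Seq Line Items]. Adding the INCLUDE_STATISTICS keyword to the query lets you see the probability and support for each recommendation.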
SELECT PredictAssociation([Association].[vAssocSeqLineItems],INCLUDE_STATISTICS, 3)
FROM [Association]
NATURAL PREDICTION JOIN
(SELECT
(SELECT 'Classic Vest' as [Model])
AS [v Assoc Seq Line Items])
AS t
Sample results:
Model         $SUPPORT  $PROBABILITY  $ADJUSTEDPROBABILITY
Sport-100     4334      0.291283      0.252696
Water Bottle  2866      0.19262       0.175205
Patch kit     2113      0.142012      0.132389
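The two-level SELECT mapping described for this query can be sketched in Python as a small string builder. This is only an illustration under stated assumptions: `singleton_input` is a hypothetical helper, not part of any Microsoft library; it produces just the input rowset that follows NATURAL PREDICTION JOIN; and it assumes that several input items are combined with UNION, one inner SELECT per item. Item names that contain a quote character are rejected rather than escaped.

```python
def singleton_input(items, nested_table="v Assoc Seq Line Items", key="Model"):
    """Build the input rowset of a singleton query for a nested-table column.

    Hypothetical helper for illustration only. The default names are the
    ones used in the sample query in this article.
    """
    if not items:
        raise ValueError("at least one item is required")
    for item in items:
        if "'" in item:
            # Escaping rules are deliberately left out of this sketch.
            raise ValueError(f"quote characters are not handled: {item!r}")

    # Inner SELECTs: one row per item, mapping the value to the nested key column.
    rows = "\n   UNION ".join(f"SELECT '{item}' AS [{key}]" for item in items)

    # Outer SELECT: maps the nested rowset to the case-level nested-table column.
    return f"(SELECT\n  ({rows})\n  AS [{nested_table}])\nAS t"


if __name__ == "__main__":
    print(singleton_input(["Classic Vest"]))
    print(singleton_input(["Classic Vest", "Water Bottle"]))
```

With a single item the helper reproduces the input clause of the sample query; with two items the inner rowset becomes two SELECT statements joined by UNION, while the outer SELECT stays unchanged.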
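Sample Query 7: Determining Confidence for Related Itemsets
Although rules are useful for generating recommendations, itemsets are more valuable for deeper analysis of the patterns in a data set. For example, if you are not satisfied with the recommendations returned by the previous sample query, you can examine other itemsets that contain Product A to get a better idea of whether Product A is an accessory that people tend to buy along with all kinds of products, or whether it is strongly associated with purchases of particular products. The easiest way to explore these relationships is to filter the itemsets in the Microsoft Association Viewer, but you can also retrieve the same information with a query.
The following sample query returns all itemsets that contain the Water Bottle item, including the single-item itemset Water Bottle.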
SELECT TOP 100 FROM
(
SELECT FLATTENED NODE_CAPTION, NODE_SUPPORT,
(SELECT ATTRIBUTE_NAME from NODE_DISTRIBUTION
WHERE ATTRIBUTE_NAME = 'v Assoc Seq Line Items(Water Bottle)') as D
FROM Association.CONTENT
WHERE NODE_TYPE = 7
) AS Items
WHERE [D.ATTRIBUTE_NAME] <> NULL
ORDER BY NODE_SUPPORT DESC
Sample results:
NODE_CAPTION                                              NODE_SUPPORT  D.ATTRIBUTE_NAME
Water Bottle = Existing                                   2866          v Assoc Seq Line Items(Water Bottle)
Mountain Bottle Cage = Existing, Water Bottle = Existing  1136          v Assoc Seq Line Items(Water Bottle)
Road Bottle Cage = Existing, Water Bottle = Existing      1068          v Assoc Seq Line Items(Water Bottle)
Water Bottle = Existing, Sport-100 = Existing             734           v Assoc Seq Line Items(Water Bottle)
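The query returns not only the rows from the nested table that match the criteria, but also all the rows from the outer table, or case table. Therefore, you must add a condition that eliminates the case table rows that have a null value for the target attribute name.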
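The support counts returned for these itemsets can be turned into a confidence figure for each related item. The sketch below is not something the query itself computes: it applies the usual association-rule definition, confidence(A -> B) = support(A and B) / support(A), to the sample rows for Water Bottle, and it assumes that NODE_SUPPORT is the number of cases that contain the itemset. The function names are hypothetical.

```python
# Rows copied from the sample results: (NODE_CAPTION, NODE_SUPPORT).
itemsets = [
    ("Water Bottle = Existing", 2866),
    ("Mountain Bottle Cage = Existing, Water Bottle = Existing", 1136),
    ("Road Bottle Cage = Existing, Water Bottle = Existing", 1068),
    ("Water Bottle = Existing, Sport-100 = Existing", 734),
]


def parse_items(caption):
    """Split 'A = Existing, B = Existing' into frozenset({'A', 'B'}).

    Assumes item names contain neither ',' nor ' = '.
    """
    return frozenset(part.split(" = ")[0].strip() for part in caption.split(","))


def confidences(rows, target):
    """Return {other items: support(target and others) / support(target)}."""
    support = {parse_items(caption): count for caption, count in rows}
    base = support[frozenset([target])]  # support of the single-item itemset
    result = {}
    for items, count in support.items():
        if target in items and len(items) > 1:
            others = ", ".join(sorted(items - {target}))
            result[others] = count / base
    return result


conf = confidences(itemsets, "Water Bottle")
for other, value in sorted(conf.items(), key=lambda kv: -kv[1]):
    print(f"{other}: {value:.3f}")
```

On the sample rows this gives roughly 0.396 for Mountain Bottle Cage, 0.373 for Road Bottle Cage, and 0.256 for Sport-100, that is, the share of Water Bottle cases that also contain each item.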
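Function List
All Microsoft algorithms support a common set of functions. The Microsoft Association algorithm also supports the additional functions listed below.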
IsDescendant
PredictHistogram
IsInNode
PredictNodeId
PredictAdjustedProbability
PredictProbability (DMX)
PredictAssociation
PredictSupport (DMX)