Example 4-12 - PerKeyAvg for Python Incorrect #24

@funseiki

In the example, the map method is shown taking a lambda with two parameters (key and xy), but the Python version of Spark appears to have only a map method that expects a lambda with a single parameter. Each element of sumCount is a single (key, (sum, count)) pair, so it is passed to the lambda as one argument.

So instead of the following

r = sumCount.map(lambda key, xy: (key, xy[0]/xy[1])).collectAsMap()

We should use

r = sumCount.map(lambda kvp: (kvp[0], kvp[1][0] / kvp[1][1])).collectAsMap()
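
The difference can be reproduced without a Spark cluster. The sketch below is only a stand-in: it uses Python's built-in map over a plain list, on the assumption that it hands each element to the lambda as a single argument in the same way the RDD map does. The name sum_count and the sample values are made up for illustration and are not from the book.

```python
# Plain-Python stand-in for the (key, (sum, count)) pairs held in sumCount.
# The sample data is hypothetical.
sum_count = [("panda", (3, 2)), ("pink", (7, 2))]

# Two-parameter lambda: map passes each pair as ONE argument, so this fails.
try:
    dict(map(lambda key, xy: (key, xy[0] / xy[1]), sum_count))
    two_param_ok = True
except TypeError:
    two_param_ok = False

# One-parameter lambda that indexes into the pair: this works.
r = dict(map(lambda kvp: (kvp[0], kvp[1][0] / kvp[1][1]), sum_count))
```

Under Python 2, the division would also need a float operand to avoid integer truncation; the sketch assumes Python 3.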
