Learn how to build a machine learning pipeline that predicts new connections between users in a social network

Link prediction aims to predict the probability that a future or missing connection will form between nodes in a network. It is widely used in applications such as social networks, recommender systems, and biological networks. We will focus on link prediction in social networks, and we will use the same dataset we used for local link prediction with DGL in the previous post: the Twitch social network dataset. This dataset contains a graph whose nodes represent Twitch users and whose edges represent mutual friendships between users. We will use it to predict new ("follow") links between users, based on the existing links and user features.
As shown in the diagram, link prediction involves multiple steps, including importing and exporting data, preprocessing, training a model, tuning its hyperparameters, and finally deploying the inference endpoint that generates the actual predictions.
In this post, we focus on the first step of the process: preparing the data and loading it into the Neptune cluster.
Converting data into Neptune’s loader format
The original files provided with the dataset look like this:
Vertices (original):
id,days,mature,views,partner,new_id
73045350,1459,False,9528,False,2299
61573865,1629,True,3615,False,153
...
Edges (original):
from,to
6194,255
6194,980
...
To load this data into Neptune, we first need to convert it into one of the supported formats. We will use Gremlin, so the data must be in CSV files for vertices and edges, and the column names in those CSV files must follow the Gremlin load data format.
Here is what the converted data looks like:
Vertices (converted):
~id,~label,days:Int(single),mature:Bool(single),partner:Bool(single),views:Int(single)
2299,"user",1459,false,false,9528
153,"user",1629,true,false,3615
...
Edges (converted):
~from,~to,~label,~id
6194,255,"follows",0
255,6194,"follows",1
...
This is the code that converts the files provided with the dataset into the format supported by the Neptune Bulk Loader:
import pandas as pd
# === Vertices ===
# load vertices from the CSV file provided in the dataset
vertices_df = pd.read_csv('./musae_ENGB_target.csv')
# drop old ID column, we'll use the new IDs only
vertices_df.drop('id', axis=1, inplace=True)
# rename columns for Neptune Bulk Loader:
# add ~ to the id column,
# add data types and cardinality to vertex property columns
vertices_df.rename(
    columns={
        'new_id': '~id',
        'days': 'days:Int(single)',
        'mature': 'mature:Bool(single)',
        'views': 'views:Int(single)',
        'partner': 'partner:Bool(single)',
    },
    inplace=True,
)
# add vertex label column
vertices_df['~label'] = 'user'
# save vertices to a file, ignore the index column
vertices_df.to_csv('vertices.csv', index=False)
# === Edges ===
# load edges from the CSV file provided in the dataset
edges_df = pd.read_csv('./musae_ENGB_edges.csv')
# add reverse edges (the original edges represent mutual follows)
reverse_edges_df = edges_df[['to', 'from']].copy()  # copy to avoid mutating a view
reverse_edges_df.rename(columns={'from': 'to', 'to': 'from'}, inplace=True)
edges_df = pd.concat([edges_df, reverse_edges_df], ignore_index=True)
# rename columns according to Neptune Bulk Loader format:
# add ~ to 'from' and 'to' column names
edges_df.rename(
    columns={
        'from': '~from',
        'to': '~to',
    },
    inplace=True,
)
# add edge label column
edges_df['~label'] = 'follows'
# add edge IDs
edges_df['~id'] = range(len(edges_df))
# save edges to a file, ignore the index column
edges_df.to_csv('edges.csv', index=False)
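Before uploading, it can be worth sanity-checking the generated files. This is a short sketch that re-reads them and verifies the Bulk Loader headers and the edge symmetry we just created:
import pandas as pd

# Read the converted files back and check the Neptune Bulk Loader headers.
vertices = pd.read_csv('vertices.csv')
edges = pd.read_csv('edges.csv')

assert '~id' in vertices.columns and '~label' in vertices.columns
assert {'~from', '~to', '~label', '~id'}.issubset(edges.columns)

# The edge list should be symmetric, since we added a reverse edge
# for every original (mutual) follow.
pairs = set(zip(edges['~from'], edges['~to']))
assert all((t, f) in pairs for f, t in pairs)

print(len(vertices), 'vertices,', len(edges), 'edges')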
Allowing Neptune to access the data in S3: IAM role and VPC endpoint
After converting the files, we upload them to S3. To do this, we first need to create a bucket to hold our files. We also need to create an IAM role that allows access to the S3 bucket (in its attached policy) and has a trust policy that allows Neptune to assume it (see screenshot).
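These steps can also be scripted. Below is a minimal boto3 sketch, assuming placeholder bucket and role names and using the AWS-managed S3 read-only policy for brevity; Neptune assumes roles through the rds.amazonaws.com service principal:
import json
import boto3

s3 = boto3.client('s3')
iam = boto3.client('iam')

# Upload the converted files to S3 (bucket-name is a placeholder).
s3.upload_file('vertices.csv', 'bucket-name', 'vertices.csv')
s3.upload_file('edges.csv', 'bucket-name', 'edges.csv')

# Trust policy that lets Neptune assume the role;
# Neptune uses the rds.amazonaws.com service principal.
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "rds.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}

iam.create_role(
    RoleName='neptune-loader-role',  # hypothetical role name
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)

# Attach a policy granting read access to the bucket; the broad
# managed read-only policy is used here only to keep the sketch short.
iam.attach_role_policy(
    RoleName='neptune-loader-role',
    PolicyArn='arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess',
)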
We will add the role to our Neptune cluster (using the Neptune console), then wait until it becomes active (or restart the cluster).
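If you prefer scripting over the console, the Neptune management API exposes the same operation (a sketch; the cluster identifier and role ARN are placeholders):
import boto3

neptune = boto3.client('neptune')

# Associate the IAM role with the Neptune cluster; the role must
# show as active on the cluster before bulk loading will work.
neptune.add_role_to_db_cluster(
    DBClusterIdentifier='my-neptune-cluster',
    RoleArn='arn:aws:iam::account-id:role/neptune-loader-role',
)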
We also need to allow network traffic from Neptune to S3. To do so, we need a VPC gateway endpoint for S3 in our VPC.
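The gateway endpoint can be created from code as well (a sketch; the VPC ID, route table ID, and region are placeholders that must match your cluster's networking):
import boto3

ec2 = boto3.client('ec2')

# Create a gateway VPC endpoint so traffic from Neptune to S3
# stays inside the VPC (IDs and region are placeholders).
ec2.create_vpc_endpoint(
    VpcEndpointType='Gateway',
    VpcId='vpc-0123456789abcdef0',
    ServiceName='com.amazonaws.us-east-1.s3',
    RouteTableIds=['rtb-0123456789abcdef0'],
)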
Loading the data
We are now ready to start loading our data. To do this, we call the Neptune Bulk Loader API from inside the VPC and create two load jobs: one for vertices.csv and another for edges.csv. The API calls are identical; only the S3 object key differs. The VPC configuration and security groups must allow traffic from the instance where you run curl to the Neptune cluster.
curl -X POST \
    -H 'Content-Type: application/json' \
    https://your-neptune-endpoint:8182/loader -d '
    {
      "source" : "s3://bucket-name/vertices.csv",
      "format" : "csv",
      "iamRoleArn" : "arn:aws:iam::account-id:role/role-name",
      "region" : "us-east-1",
      "failOnError" : "TRUE",
      "parallelism" : "HIGH",
      "updateSingleCardinalityProperties" : "FALSE"
    }'
The loader API responds with JSON containing the load job ID ("loadId"):
{
"status" : "200 OK",
"payload" : {
"loadId" : "your-load-id"
}
}
You can check whether the load has completed using this API:
curl -X GET https://your-neptune-endpoint:8182/loader/your-load-id
It responds with something like this:
{
"status" : "200 OK",
"payload" : {
"feedCount" : [
{
"LOAD_COMPLETED" : 1
}
],
"overallStatus" : {
"fullUri" : "s3://bucket-name/vertices.csv",
"runNumber" : 1,
"retryNumber" : 1,
"status" : "LOAD_COMPLETED",
"totalTimeSpent" : 8,
"startTime" : 1,
"totalRecords" : 35630,
"totalDuplicates" : 0,
"parsingErrors" : 0,
"datatypeMismatchErrors" : 0,
"insertErrors" : 0
}
}
}
Once the vertices are loaded from vertices.csv, we can load the edges using the same API. To do this, we just replace vertices.csv with edges.csv in the first curl command and run it again.
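If you prefer to script the whole process instead of running curl by hand, both load jobs can be submitted and polled from Python. This is a sketch using the third-party requests library, with the same placeholder endpoint, bucket, and role ARN as above; loading sequentially guarantees the vertices exist before the edges reference them:
import time
import requests

LOADER = 'https://your-neptune-endpoint:8182/loader'  # placeholder endpoint

for key in ['vertices.csv', 'edges.csv']:
    # Submit the load job; only the S3 object key differs between the two calls.
    resp = requests.post(LOADER, json={
        'source': f's3://bucket-name/{key}',
        'format': 'csv',
        'iamRoleArn': 'arn:aws:iam::account-id:role/role-name',
        'region': 'us-east-1',
        'failOnError': 'TRUE',
        'parallelism': 'HIGH',
        'updateSingleCardinalityProperties': 'FALSE',
    })
    load_id = resp.json()['payload']['loadId']

    # Poll the status API until the job leaves the in-progress states.
    while True:
        status = requests.get(f'{LOADER}/{load_id}').json()
        overall = status['payload']['overallStatus']['status']
        if overall not in ('LOAD_NOT_STARTED', 'LOAD_IN_PROGRESS'):
            print(key, overall)
            break
        time.sleep(5)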
Check the loaded data
When the load jobs are finished, we can access the loaded data by sending Gremlin queries to the Neptune cluster. To run these queries, we can either connect to Neptune with a Gremlin console or use a Neptune/SageMaker notebook. We will use the SageMaker notebook, which can be created either together with the Neptune cluster or added later when the cluster is already running.
This is the query that gets the number of vertices we created:
%%gremlin
g.V().count()
You can also get a vertex by ID and check that its properties were loaded properly with:
%%gremlin
g.V('some-vertex-id').elementMap()
After loading the edges, you can check that they were loaded successfully with
%%gremlin
g.E().count()
and
%%gremlin
g.E('0').elementMap()
This concludes the data loading part of the process. In the next post, we will look at exporting data from Neptune in a format that can be used for ML model training.