One of the things that my own forum (https://hub.phenomlab.com) has been doing successfully for a few months is querying RSS feeds, then using the Flarum API to create a discussion for each new item.
Want this? Sure you do! Below are the steps, including all the scripts needed to make this work.
Firstly, you will need the Flarum API client from here: https://github.com/flagrow/flarum-api-client#configuration
Installation
composer require flagrow/flarum-api-client
Configuration
In order to start working with the client you might need a Flarum master key:
- Generate a random, unguessable 40-character string. This is the token needed for this package.
- Manually add it to the api_keys table using phpMyAdmin, Adminer or another tool.
The master key is required to access non-public discussions and to run actions otherwise reserved for Flarum administrators.
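One way to produce such a string is with PHP's built-in random generator. This is a minimal sketch rather than the package's own method: the function name generateApiToken is hypothetical, and any other source of 40 unguessable characters would do just as well.

```php
<?php
// Hypothetical helper: returns a 40-character hexadecimal token.
function generateApiToken(): string
{
    // 20 bytes from the cryptographically secure generator
    // become 40 hexadecimal characters.
    return bin2hex(random_bytes(20));
}

// Print a token to paste into the api_keys table.
echo generateApiToken() . "\n";
```

Run it once from the CLI and copy the output; the same value later goes into the Authorization header in details.php.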
Install SimplePie
Next, install SimplePie to parse the RSS feeds
composer require simplepie/simplepie
Create storage DB
Now access your database server using phpMyAdmin (or something similar) and create a new database called "feed".
With the database created, run the following script, which will create a table called "queue" with a few simple columns
CREATE TABLE `queue` (
`id` bigint(20) NOT NULL,
`url` varchar(500) NOT NULL,
`title` varchar(500) NOT NULL,
`seen` int(1) NOT NULL DEFAULT 0
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
As your "queue" table gets bigger, it'll need indexes to keep lookups simple and fast. Create them as follows in phpMyAdmin
ALTER TABLE `queue`
ADD PRIMARY KEY (`id`),
ADD KEY `title` (`title`),
ADD KEY `url` (`url`);
Finally, we'll set AUTO_INCREMENT on the id column of the table
ALTER TABLE `queue`
MODIFY `id` bigint(20) NOT NULL AUTO_INCREMENT, AUTO_INCREMENT=1;
COMMIT;
Create credentials file
For security reasons, we "include" a details.php file that lives outside of the web root (you can call this file whatever you like, just remember to reflect any change of name in the main script below). We are going to be running this from PHP-CLI anyway, so it shouldn't be exposed.
In my case, details.php is included like the below. It's located at the root of my domain, but outside of the web root
include("/var/www/vhosts/phenomlab.com/details.php");
Your details.php file should contain this
<?php
$header = array(
    "Authorization: Token the token you generated in the first step",
    "Content-Type: application/json"
);

// Create DB connection
$servername = "localhost";
$login = "yourdblogin";
$dbpw = "yourdbpassword";
$dbname = "feed";
$conn = new mysqli($servername, $login, $dbpw, $dbname);

// Check connection
if ($conn->connect_error) {
    die("Connection failed: " . $conn->connect_error);
} else {
    echo "Connected to database\n";
}
?>
Create the RSS parser script
Create a new PHP file called rssparser.php, again located outside of the web root
<?php
include("/path/to/your/details.php");
require 'vendor/autoload.php';

// RSS feed URLs to process
$site = array(
    "yourrssfeedurlgoeshere"
);

foreach ($site as $url) {
    echo "\nProcessing RSS feed " . $url . "\n\n";
    $feed = new SimplePie();
    $feed->enable_cache();
    $feed->set_cache_location("/path/to/your/chosen/cache/directory");
    $feed->set_feed_url($url);
    $feed->init();
    $items = 10;
    for ($i = 0; $i < $items; $i++) {
        $item = $feed->get_item($i);
        // Stop early if the feed has fewer than $items entries
        if ($item === null) {
            break;
        }
        // Clean up the description. Each step works on the result of the previous one
        $description = str_replace("View Entire Post ›", "", (string) $item->get_description());
        $description = str_replace("<img", "\n\n<img", $description);
        $description = str_replace('<img src="', '', $description);
        $description = str_replace('" />', '', $description);
        $description = strip_tags(html_entity_decode($description), "<img>") . "\n";
        $description .= "\n" . '[Link to original article](' . $item->get_link() . ')' . "\n\n";
        // Define variables for use later on in the script
        $subject = $item->get_title();
        $body = trim($description);
        $link = $item->get_link();
        // Query the database for each item. Perform action based on results
        $stmt = $conn->prepare('SELECT url, seen FROM queue WHERE url = ?');
        $stmt->bind_param('s', $link);
        $stmt->execute();
        $stmt->store_result();
        $stmt->bind_result($checklink, $seen);
        // fetch() returns true only when a matching row was found
        $found = $stmt->fetch();
        // Test to see if we have processed these before. If we have, skip them to avoid duplicates
        if ($found !== true || !$seen) {
            echo "Checking " . $link . " \nThis does not exist...processing\n";
            // Processing new items. Insert record into database to prevent duplication on subsequent processing runs
            $seen = 1;
            $stmt = $conn->prepare('INSERT INTO queue (url, title, seen) VALUES(?, ?, ?)');
            $stmt->bind_param("ssi", $link, $subject, $seen);
            $stmt->execute();
            // Process each newly identified unique post into Flarum using the API
            $ch = curl_init();
            curl_setopt($ch, CURLOPT_URL, 'https://yourflarumurl/api/discussions');
            curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
            curl_setopt($ch, CURLOPT_FOLLOWLOCATION, true);
            curl_setopt($ch, CURLOPT_HTTPHEADER, $header);
            curl_setopt($ch, CURLOPT_POST, 22);
            curl_setopt($ch, CURLOPT_POSTFIELDS, json_encode(array(
                'data' => array(
                    'type' => "discussions",
                    'attributes' => array(
                        'title' => "$subject",
                        'content' => "$body"
                    ),
                    'relationships' => array(
                        'tags' => array(
                            'data' => array(
                                array(
                                    'type' => 'tags',
                                    'id' => "23"
                                )
                            )
                        )
                    )
                )
            )));
            // This turns off TLS certificate verification. Remove the line if your forum has a valid certificate
            curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 0);
            $result = curl_exec($ch);
            echo $result;
        }
        // Item has already been processed. Continue loop until count exhausted
        else {
            echo "Checking " . $checklink . " \nThis has already been processed...skipping\n";
        }
    }
}
Important notes
$items = 10; sets the maximum number of RSS items that the script will parse for each feed URL
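curl_setopt($ch, CURLOPT_POST, 22);
- CURLOPT_POST is only an on/off switch, so any non-zero value (22 here) just tells cURL to send a POST request. It does not choose the user. The user the discussion is posted as comes from the API key: either the user_id stored against the key in the api_keys table (if your Flarum version has that column), or a "; userId=22" suffix after the token in the Authorization header in details.php, where 22 is the ID of the user I want to post as. This user needs admin rights.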
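To post as a specific user, the Authorization header can carry a userId suffix after the token. Below is a minimal sketch of building the $header array that way. The helper name buildFlarumHeaders is hypothetical, and the "Token <key>; userId=<id>" format is an assumption based on Flarum's REST API documentation, so check it against the Flarum version you run.

```php
<?php
// Hypothetical helper: builds the request headers passed to CURLOPT_HTTPHEADER.
function buildFlarumHeaders(string $token, ?int $userId = null): array
{
    $auth = 'Authorization: Token ' . $token;
    if ($userId !== null) {
        // Assumption: Flarum reads "; userId=<id>" after the token
        // to decide which user the request acts as.
        $auth .= '; userId=' . $userId;
    }
    return [$auth, 'Content-Type: application/json'];
}

// 22 is the example user ID used in this post.
$header = buildFlarumHeaders('the token you generated in the first step', 22);
print_r($header);
```

The resulting array can replace the hand-written $header in details.php without any change to the cURL calls.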
array(
'type' => 'tags',
'id' => "23"
)
This array tells the Flarum API in which tag to post. In this case, "23" is the ID of the "news" tag.
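If you want to post into more than one tag, or reuse the same payload for several feeds, the nested array can be built by a small function. This is only a sketch: buildDiscussionPayload is a hypothetical name, the tag IDs are examples, and the structure simply mirrors the array used in rssparser.php.

```php
<?php
// Hypothetical helper: builds the JSON body for POST /api/discussions.
function buildDiscussionPayload(string $title, string $content, array $tagIds): string
{
    // One relationship entry per tag ID, with IDs sent as strings
    // to match the original script.
    $tags = [];
    foreach ($tagIds as $id) {
        $tags[] = ['type' => 'tags', 'id' => (string) $id];
    }

    return json_encode([
        'data' => [
            'type' => 'discussions',
            'attributes' => ['title' => $title, 'content' => $content],
            'relationships' => ['tags' => ['data' => $tags]],
        ],
    ]);
}

// 23 is the ID of the "news" tag in this post.
echo buildDiscussionPayload('Example title', 'Example body', [23]) . "\n";
```

The returned string can be handed straight to CURLOPT_POSTFIELDS in place of the inline json_encode call.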
Test it!
To check that your script is working, run it from the CLI, in the directory where your files are located
php rssparser.php
Watch the output on the screen. The first time this runs, the script will create a discussion for every RSS item it has no record of (up to $items per feed). As each item is processed, it is recorded in the "feed" database so that subsequent runs do not create duplicates.
Now what?
I have this rssparser.php scheduled to run every hour.
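The raw JSON echoed by the script can be hard to read at a glance. Here is a minimal sketch of a check that turns the response into a simple yes or no. The function name discussionCreated is hypothetical, and it assumes a successful call returns a 2xx status with a JSON "data" object that has an id, so adjust it if your Flarum version answers differently.

```php
<?php
// Hypothetical helper: decides whether an API call created a discussion.
function discussionCreated(int $httpStatus, string $responseBody): bool
{
    $decoded = json_decode($responseBody, true);

    // Assumption: success means a 2xx status and a "data" object with an id.
    return $httpStatus >= 200 && $httpStatus < 300
        && is_array($decoded)
        && isset($decoded['data']['id']);
}

// Sample values standing in for curl_exec($ch) and
// curl_getinfo($ch, CURLINFO_HTTP_CODE).
var_dump(discussionCreated(201, '{"data":{"type":"discussions","id":"101"}}'));
var_dump(discussionCreated(401, '{"errors":[{"status":"401"}]}'));
```

In rssparser.php the status code would come from curl_getinfo($ch, CURLINFO_HTTP_CODE) right after curl_exec($ch).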
Enjoy - let me know if you have any issues getting this to work.